Introduction

This notebook will be dedicated to exploring details of the PISA 2012 dataset. PISA, in particular, is a "survey of students' skills and knowledge as they approach the end of compulsory education. It is not a conventional school test. Rather than examining how well students have learned the school curriculum, it looks at how well prepared they are for life beyond school" (Udacity, 2019).

Within this dataset we can find information for about 510,000 students. The PISA 2012 dataset includes information on mathematics, reading in the test language, and science.

Throughout the course of this notebook I will have these two questions in mind:

  • Are there differences in achievement based on gender or parental education levels?
  • Is there a relationship between the amount of time a student dedicates to learning and their score?

Preliminary Wrangling

To begin, let's assess the dataset and clean up any remaining issues.

In [3]:
# Import all packages and set plots to be embedded inline
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sb

%matplotlib inline
In [4]:
# Read in the cleaned csv that was created in the wrangle_pisa notebook
pisa = pd.read_csv('pisa_df.csv')
In [5]:
# Set up variables for colors to be used in plotting
color1 = '#a7d7c5'
color2 = '#74b49b'
color3 = '#5c8d89'
color_male = '#ff8162' 
color_female = '#ffcd60'
color_gends = ['#ffcd60', '#ff8162']
line = '#ff8000'

Assessing and Cleaning the Data

General

In [6]:
# How many rows and variables the dataset holds
pisa.shape
Out[6]:
(15167, 19)
In [7]:
# What are the data types of the variables
pisa.dtypes
Out[7]:
Country                                           object
Student ID                                         int64
Gender                                            object
Out-of-School Study Time - Homework              float64
Out-of-School Study Time - Guided Homework       float64
Out-of-School Study Time - Personal Tutor        float64
Out-of-School Study Time - Commercial Company    float64
Out-of-School Study Time - With Parent           float64
Learning Time - Mathematics                      float64
Learning Time - Test Language                    float64
Learning Time - Science                          float64
Average Math Score                               float64
Average Reading Score                            float64
Average Science Score                            float64
Average Total Score                              float64
Education - Father                                object
Education - Mother                                object
Out-of-School Study Time - Total                 float64
Learning Time - Total                            float64
dtype: object
In [8]:
# See 10 examples of data in the dataset 
pisa.sample(10)
Out[8]:
Country Student ID Gender Out-of-School Study Time - Homework Out-of-School Study Time - Guided Homework Out-of-School Study Time - Personal Tutor Out-of-School Study Time - Commercial Company Out-of-School Study Time - With Parent Learning Time - Mathematics Learning Time - Test Language Learning Time - Science Average Math Score Average Reading Score Average Science Score Average Total Score Education - Father Education - Mother Out-of-School Study Time - Total Learning Time - Total
6588 Italy 25326 Female 21.0 0.0 0.0 0.0 0.0 300.0 240.0 120.0 623.23570 617.65948 613.42784 618.107673 Post-secondary Short-cycle tertiary 21.0 660.0
613 Qatar 4956 Male 3.0 3.0 3.0 3.0 3.0 275.0 275.0 275.0 293.97880 304.07932 316.43042 304.829513 Bachelor’s or equivalent Bachelor’s or equivalent 15.0 825.0
14031 Italy 25903 Male 10.0 0.0 0.0 0.0 0.0 240.0 240.0 300.0 427.02134 422.92756 470.66392 440.204273 Primary Lower secondary 10.0 780.0
14552 Italy 29595 Male 20.0 0.0 0.0 0.0 0.0 275.0 220.0 110.0 668.95932 652.76504 616.31854 646.014300 Post-secondary Primary 20.0 605.0
11667 Mexico 30213 Female 5.0 1.0 1.0 1.0 1.0 250.0 150.0 150.0 392.20284 417.49336 424.50580 411.400667 Short-cycle tertiary Upper secondary 9.0 550.0
8505 Mexico 22261 Female 2.0 2.0 2.0 3.0 3.0 250.0 150.0 250.0 393.21544 444.34104 384.50206 407.352847 Short-cycle tertiary Short-cycle tertiary 12.0 650.0
4731 Spain 14949 Female 2.0 2.0 2.0 0.0 1.0 180.0 180.0 180.0 437.38122 457.36770 431.21970 441.989540 Bachelor’s or equivalent Short-cycle tertiary 7.0 540.0
3190 Qatar 9805 Female 3.0 0.0 0.0 0.0 1.0 220.0 220.0 440.0 604.46332 578.10286 566.71020 583.092127 Bachelor’s or equivalent Short-cycle tertiary 4.0 880.0
3359 Canada 17419 Male 6.0 0.0 0.0 0.0 0.0 330.0 330.0 330.0 495.80152 508.65552 585.73294 530.063327 Bachelor’s or equivalent Short-cycle tertiary 6.0 990.0
5849 Mexico 14764 Female 2.0 1.0 0.0 0.0 0.0 240.0 240.0 240.0 371.24942 422.25922 394.38644 395.965027 Short-cycle tertiary Short-cycle tertiary 3.0 720.0
In [9]:
# Descriptive statistics for each numeric variable
pisa.describe()
Out[9]:
Student ID Out-of-School Study Time - Homework Out-of-School Study Time - Guided Homework Out-of-School Study Time - Personal Tutor Out-of-School Study Time - Commercial Company Out-of-School Study Time - With Parent Learning Time - Mathematics Learning Time - Test Language Learning Time - Science Average Math Score Average Reading Score Average Science Score Average Total Score Out-of-School Study Time - Total Learning Time - Total
count 15167.000000 15167.000000 15167.000000 15167.000000 15167.000000 15167.000000 15167.000000 15167.000000 15167.000000 15167.000000 15167.000000 15167.000000 15167.000000 15167.000000 15167.000000
mean 16988.039626 6.673370 1.608031 0.825674 0.688271 1.093624 246.463374 242.097119 228.560823 503.980473 506.784609 508.320475 506.361852 10.888969 717.121316
std 9561.527074 5.791363 2.485229 1.946453 1.911433 2.073632 90.533247 94.357716 133.939948 90.423852 87.835595 88.778732 85.568534 9.083546 252.006634
min 2.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 199.104220 143.690300 144.479680 162.424733 0.000000 0.000000
25% 9031.500000 2.000000 0.000000 0.000000 0.000000 0.000000 200.000000 200.000000 135.000000 438.627520 447.627480 445.300260 445.497293 5.000000 550.000000
50% 17275.000000 5.000000 1.000000 0.000000 0.000000 0.000000 240.000000 225.000000 200.000000 503.201440 510.427600 509.175700 508.154200 9.000000 660.000000
75% 25033.500000 9.000000 2.000000 1.000000 0.000000 1.000000 285.000000 275.000000 300.000000 568.125880 569.683140 573.237600 568.191293 14.000000 810.000000
max 33806.000000 30.000000 30.000000 30.000000 30.000000 30.000000 1440.000000 1800.000000 1920.000000 796.627180 790.138200 834.800440 767.428960 122.000000 3000.000000

Parental Education

In [10]:
# The type and quantity of the educational levels for 'Education - Father'
pisa['Education - Father'].value_counts()
Out[10]:
Short-cycle tertiary        6509
Bachelor’s or equivalent    5298
Upper secondary             1244
Post-secondary               900
Lower secondary              881
Primary                      232
Early childhood              103
Name: Education - Father, dtype: int64
In [11]:
# The type and quantity of the educational levels for 'Education - Mother'
pisa['Education - Mother'].value_counts()
Out[11]:
Short-cycle tertiary        7086
Upper secondary             2338
Bachelor’s or equivalent    2303
Lower secondary             1582
Post-secondary              1150
Primary                      489
Early childhood              219
Name: Education - Mother, dtype: int64
In [12]:
# Convert parental level of education into ordered categorical types
ordinal_var_dict = {'Education - Father': ['Early childhood', 'Primary', 'Lower secondary', 'Upper secondary', 'Post-secondary', 'Short-cycle tertiary', 'Bachelor’s or equivalent'],
                    'Education - Mother': ['Early childhood', 'Primary', 'Lower secondary', 'Upper secondary', 'Post-secondary', 'Short-cycle tertiary', 'Bachelor’s or equivalent']}

for var in ordinal_var_dict:
    ordered_var = pd.api.types.CategoricalDtype(ordered = True,
                                                categories = ordinal_var_dict[var])
    pisa[var] = pisa[var].astype(ordered_var)
In [17]:
pisa.shape
Out[17]:
(15167, 19)
In [18]:
# Count the rows whose Student ID repeats an earlier one
pisa['Student ID'].duplicated().sum()
Out[18]:
0
In [19]:
# Drop any rows that are exact copies of an earlier row
pisa.drop_duplicates(inplace=True)
In [20]:
# Confirm that no fully duplicated rows remain
pisa.duplicated().sum()
Out[20]:
0
In [21]:
pisa.shape
Out[21]:
(15167, 19)
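The checks above look at two different things: repeated values in one column and fully repeated rows. As a minimal sketch of the difference, here is a toy frame with made-up values (the 'Score' column and all numbers are hypothetical, not taken from the PISA data):

```python
import pandas as pd

# Toy frame with hypothetical values; not taken from the PISA data
toy = pd.DataFrame({
    'Country': ['Italy', 'Italy', 'Spain', 'Spain'],
    'Student ID': [1, 1, 1, 2],
    'Score': [500.0, 500.0, 480.0, 510.0],
})

# Fully identical rows: only the second Italy row repeats an earlier row
full_row_dupes = toy.duplicated().sum()

# Repeated IDs on their own: every occurrence of an ID after the first is flagged
id_dupes = toy['Student ID'].duplicated().sum()

# Repeated (Country, Student ID) pairs
pair_dupes = toy.duplicated(subset=['Country', 'Student ID']).sum()

# drop_duplicates() without arguments removes only fully identical rows
deduped = toy.drop_duplicates()
print(full_row_dupes, id_dupes, pair_dupes, deduped.shape)
```

In this toy frame the same ID appears in two countries, which is why the column-level count and the row-level count differ. In the notebook both counts are 0, so the distinction does not change the result here.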

The structure of the dataset

This cleaned version of the PISA 2012 dataset is composed of 15,167 rows, each of which represents one student. The dataset has 19 variables, most of which are numeric. Two of them differ in that they are ordered categorical variables: the highest educational levels of the student's mother and father, sorted from the lowest level of education to the highest:

(least educated) -> (most educated)
<ISCED level 0> : Pre-primary education
<ISCED level 1> : Primary education or first stage of basic education
<ISCED level 2> : Lower secondary education or second stage of basic education
<ISCED level 3> : Upper secondary education
<ISCED level 4> : Post-secondary non-tertiary education
<ISCED level 5> : First stage of tertiary education
<ISCED level 6> : Second stage of tertiary education
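To illustrate what the ordering buys us, here is a minimal sketch using the same category labels as the conversion cell earlier in the notebook. The three values in `toy_edu` are hypothetical and only there to show the behaviour:

```python
import pandas as pd

# Same labels, lowest to highest, as in the conversion cell
levels = ['Early childhood', 'Primary', 'Lower secondary', 'Upper secondary',
          'Post-secondary', 'Short-cycle tertiary', 'Bachelor’s or equivalent']
edu_dtype = pd.api.types.CategoricalDtype(categories=levels, ordered=True)

# Hypothetical values, not taken from the dataset
toy_edu = pd.Series(['Primary', 'Bachelor’s or equivalent', 'Upper secondary'],
                    dtype=edu_dtype)

# Comparisons follow the category order, not alphabetical order
at_least_upper = toy_edu >= 'Upper secondary'

# min() and sort_values() also respect the category order
lowest = toy_edu.min()
sorted_levels = toy_edu.sort_values().tolist()
print(at_least_upper.tolist(), lowest, sorted_levels)
```

Because the dtype is ordered, plots and group summaries built on these columns should also list the levels from lowest to highest rather than alphabetically.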

Main feature of interest in the dataset

The main feature that we will be exploring is the 'Average Total Score'.

Features that will support the investigation into 'Average Total Score'

To better understand the Average Total Score, I believe that 'Out-of-School Study Time - Total' and 'Learning Time - Total' will provide illuminating results. The common assumption is that the more homework a student completes, the better they will perform on tests, but a growing body of research suggests that homework time is not a good predictor of test success. Rather, I expect that the educational level of the parents and the number of books in the home will be better predictors of a student's test success.

Univariate Exploration

We can start off by looking at the main feature of interest: the average total score.

In particular, let's first look at a standard-scale plot of this value to see its distribution.

In [22]:
# Histogram of Average Total Score
binsize = 20
bins = np.arange(0, pisa['Average Total Score'].max()+binsize, binsize)
plt.figure(figsize=[8, 5])
plt.hist(data = pisa, x = 'Average Total Score', bins = bins, color = color1)
plt.xlabel('Average Total Score')
plt.ylabel('Frequency')
plt.title('Frequency of Average Total Scores');

Here we can see that the distribution is approximately normal. This is not surprising, since bell curves are expected when it comes to students' grades.
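A visual impression of a bell curve can be backed up with a quick numeric check. The sketch below is an illustration only: it uses synthetic samples (the names `symmetric` and `right_skewed` and all parameters are my own assumptions, not PISA values) to show how `Series.skew()` separates a roughly symmetric distribution from a right-skewed one.

```python
import numpy as np
import pandas as pd

# Synthetic samples; the parameters are arbitrary and not taken from PISA
rng = np.random.default_rng(0)
symmetric = pd.Series(rng.normal(loc=500, scale=90, size=10_000))
right_skewed = pd.Series(rng.exponential(scale=5, size=10_000))

# Sample skewness: close to 0 for symmetric data, clearly positive for a right tail
sym_skew = symmetric.skew()
right_skew = right_skewed.skew()
print(round(sym_skew, 2), round(right_skew, 2))
```

The same call could be applied to a real column, for example `pisa['Average Total Score'].skew()`, assuming the dataframe is loaded as above.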

We can now move on to the three scores that make up the total score: Math, Reading, and Science.

In [23]:
# Histogram of Average Math Score
binsize = 20
bins = np.arange(0, pisa['Average Math Score'].max()+binsize, binsize)
plt.figure(figsize=[8, 5])
plt.hist(data = pisa, x = 'Average Math Score', bins=bins, color = color1)
plt.xlabel('Average Math Score')
plt.ylabel('Frequency')
plt.title('Frequency of Average Math Scores');

This distribution closely resembles that of the total score: it is also approximately normal.

In [24]:
# Histogram of Average Reading Score
binsize = 20
bins = np.arange(0, pisa['Average Reading Score'].max()+binsize, binsize)
plt.figure(figsize=[8, 5])
plt.hist(data = pisa, x = 'Average Reading Score', bins=bins, color = color1)
plt.xlabel('Average Reading Score')
plt.ylabel('Frequency')
plt.title('Frequency of Average Reading Scores');

Just as with the Math score, the average Reading score follows an approximately normal distribution.

In [25]:
# Histogram of Average Science Score
binsize = 20
bins = np.arange(0, pisa['Average Science Score'].max()+binsize, binsize)
plt.figure(figsize=[8, 5])
plt.hist(data = pisa, x = 'Average Science Score', bins=bins, color = color1)
plt.xlabel('Average Science Score')
plt.ylabel('Frequency')
plt.title('Frequency of Average Science Scores');

Just as with the Total, Math, and Reading scores, we can see the Science score also falls along a normal distribution.

We can now move onto the Study Time variables.

In [26]:
# Histogram of the Total Out-of-School Study Time
binsize = 2
bins = np.arange(0, pisa['Out-of-School Study Time - Total'].max()+binsize, binsize)

plt.figure(figsize=[8, 5])
plt.hist(data = pisa, x = 'Out-of-School Study Time - Total', color = color2, bins = bins)

plt.xlabel('Out-of-School Study Time - Total (h/week)')
plt.ylabel('Frequency')
plt.title('Frequency of Total Out-of-School Study Times');

From this histogram for the Total Out-of-School Study Time, we can see a strong right skew on this unimodal distribution. Because the long tail extends far past the peak, we should zoom in on the lower range of this variable.

In [82]:
# Histogram of the Total Out-of-School Study Time, zoomed in to 0-20 h/week
binsize = 1
bins = np.arange(0, pisa['Out-of-School Study Time - Total'].max()+binsize, binsize)

plt.figure(figsize=[8, 5])
plt.hist(data = pisa, x = 'Out-of-School Study Time - Total', color = color2, bins = bins)
plt.xlim(0,20)
plt.xlabel('Out-of-School Study Time - Total (h/week)')
plt.ylabel('Frequency')
plt.title('Frequency of Total Out-of-School Study Times');

The data for this distribution remains unimodal and quite consistent under a smaller scale.

Now we can look at each of the variables that have been used to create the Total Out-of-School Study Time: 'Out-of-School Study Time - Homework', 'Out-of-School Study Time - Guided Homework', 'Out-of-School Study Time - Personal Tutor', 'Out-of-School Study Time - Commercial Company', 'Out-of-School Study Time - With Parent'

In [30]:
# Histogram of the Out-of-School Study Time for Homework
binsize = 1
bins = np.arange(0, pisa['Out-of-School Study Time - Homework'].max()+binsize, binsize)

plt.figure(figsize=[8, 5])
plt.hist(data = pisa, x = 'Out-of-School Study Time - Homework', color = color2, bins = bins)

plt.xlabel('Out-of-School Study Time - Homework (h/week)')
plt.ylabel('Frequency')
plt.title('Frequency of Out-of-School Study Times for Homework');
In [31]:
# Histogram of the Out-of-School Study Time for Guided Homework
binsize = 1
bins = np.arange(0, pisa['Out-of-School Study Time - Guided Homework'].max()+binsize, binsize)

plt.figure(figsize=[8, 5])
plt.hist(data = pisa, x = 'Out-of-School Study Time - Guided Homework', color = color2, bins = bins)

plt.xlabel('Out-of-School Study Time - Guided Homework (h/week)')
plt.ylabel('Frequency')
plt.title('Frequency of Out-of-School Study Times for Guided Homework');
In [32]:
# Histogram of the Out-of-School Study Time with a Personal Tutor
binsize = 1
bins = np.arange(0, pisa['Out-of-School Study Time - Personal Tutor'].max()+binsize, binsize)

plt.figure(figsize=[8, 5])
plt.hist(data = pisa, x = 'Out-of-School Study Time - Personal Tutor', color = color2, bins = bins)

plt.xlabel('Out-of-School Study Time - Personal Tutor (h/week)')
plt.ylabel('Frequency')
plt.title('Frequency of Out-of-School Study Times with a Personal Tutor');
In [33]:
# Histogram of the Out-of-School Study Time with a Commercial Company
binsize = 1
bins = np.arange(0, pisa['Out-of-School Study Time - Commercial Company'].max()+binsize, binsize)

plt.figure(figsize=[8, 5])
plt.hist(data = pisa, x = 'Out-of-School Study Time - Commercial Company', color = color2, bins = bins)

plt.xlabel('Out-of-School Study Time - Commercial Company (h/week)')
plt.ylabel('Frequency')
plt.title('Frequency of Out-of-School Study Times with a Commercial Company');
In [34]:
# Histogram of the Out-of-School Study Time with a Parent
binsize = 1
bins = np.arange(0, pisa['Out-of-School Study Time - With Parent'].max()+binsize, binsize)

plt.figure(figsize=[8, 5])
plt.hist(data = pisa, x = 'Out-of-School Study Time - With Parent', color = color2, bins = bins)

plt.xlabel('Out-of-School Study Time - With Parent (h/week)')
plt.ylabel('Frequency')
plt.title('Frequency of Out-of-School Study Times with a Parent');

Each of the above histograms for Out-of-School Study Time reflects what we saw in the Total Out-of-School Study Time histogram. They are all strongly right-skewed unimodal distributions, which is not much of a surprise: most students put in some study time outside of school, but the number of students who can dedicate more time to studying drops off quickly.
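Since the five cells above differ only in the column name, the bin construction could be factored into a small helper. This is a sketch, not the notebook's own code: `make_bins` is a hypothetical helper, and the toy frame holds made-up hours that stand in for the study-time columns.

```python
import numpy as np
import pandas as pd

def make_bins(values, binsize):
    """Bin edges from 0 up to just past the maximum, in steps of binsize."""
    return np.arange(0, values.max() + binsize, binsize)

# Hypothetical hours per week; stands in for the study-time columns
toy = pd.DataFrame({'Homework': [0.0, 1.0, 1.0, 2.0, 5.0],
                    'With Parent': [0.0, 0.0, 0.0, 1.0, 3.0]})

# Count the observations per bin for every column with the same helper
counts = {}
for col in toy.columns:
    edges = make_bins(toy[col], binsize=1)
    counts[col], _ = np.histogram(toy[col], bins=edges)

print(counts)
```

The edges returned by `make_bins` can be passed to the `bins` argument of `plt.hist` in the same way as the arrays built by hand in the cells above.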

Now we can move on to look at the Learning Time distributions.

In [35]:
# Histogram of the Total Learning Time
binsize = 100
bins = np.arange(0, pisa['Learning Time - Total'].max()+binsize, binsize)
plt.figure(figsize=[8, 5])
plt.hist(data = pisa, x = 'Learning Time - Total', color = color3, bins=bins)
plt.xlim(0, 2500)
plt.xlabel('Learning Time - Total (mins/week)')
plt.ylabel('Frequency')
plt.title('Frequency of Total Learning Times');

Although slightly skewed to the right, this distribution is much closer to normal than the Out-of-School Study Time distribution. But to understand Learning Time, we must look into each of the subjects.

In [36]:
# Histogram of the Learning Time for Mathematics
binsize = 25
bins = np.arange(0, pisa['Learning Time - Mathematics'].max()+binsize, binsize)
plt.figure(figsize=[8, 5])
plt.hist(data = pisa, x = 'Learning Time - Mathematics', color = color3, bins=bins)
plt.xlim(0, 700)
plt.xlabel('Learning Time - Mathematics (mins/week)')
plt.ylabel('Frequency')
plt.title('Frequency of Learning Times for Mathematics');

This distribution for Mathematics-related Learning Time generally matches the unimodal, roughly normal shape that we saw for the Total Learning Time, although it is more irregular.

In [37]:
# Histogram of the Learning Time for the Test Language
binsize = 25
bins = np.arange(0, pisa['Learning Time - Test Language'].max()+binsize, binsize)
plt.figure(figsize=[8, 5])
plt.hist(data = pisa, x = 'Learning Time - Test Language', color = color3, bins=bins)
plt.xlim(0, 700)
plt.xlabel('Learning Time - Test Language (mins/week)')
plt.ylabel('Frequency')
plt.title('Frequency of Learning Times for the Test Language');

Once again, the distribution for Test Language reflects the same shape that we saw for both Mathematics and the Total Learning Time.

In [38]:
# Histogram of the Learning Time for Science
binsize = 25
bins = np.arange(0, pisa['Learning Time - Science'].max()+binsize, binsize)
plt.figure(figsize=[8, 5])
plt.hist(data = pisa, x = 'Learning Time - Science', color = color3, bins=bins)
plt.xlim(0, 700)
plt.xlabel('Learning Time - Science (mins/week)')
plt.ylabel('Frequency')
plt.title('Frequency of Learning Times for Science');

This distribution, on the other hand, shows a different story. For Science we can see a clear right skew.

Since all of the Learning Time variables have values that are beyond 600 minutes, and these values might distort our later plots, we should analyze them and determine if it makes sense to disregard them.
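The cutoff used below is read off the plot by eye. One common alternative, shown here only as a sketch and not as the method this notebook uses, is the 1.5 x IQR rule. The helper `upper_fence` and the toy values are hypothetical.

```python
import pandas as pd

def upper_fence(values, k=1.5):
    """Upper outlier fence: third quartile plus k times the interquartile range."""
    q1 = values.quantile(0.25)
    q3 = values.quantile(0.75)
    return q3 + k * (q3 - q1)

# Hypothetical minutes per week, with one extreme value
toy_minutes = pd.Series([180.0, 200.0, 240.0, 240.0, 285.0, 300.0, 1440.0])

fence = upper_fence(toy_minutes)
flagged = toy_minutes[toy_minutes > fence]
print(fence, flagged.tolist())
```

With the quartiles reported by `pisa.describe()` for 'Learning Time - Mathematics' (200 and 285), this rule would put the fence at 285 + 1.5 * 85 = 412.5 minutes, which is stricter than the eyeballed 600.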

In [39]:
# Select high outliers for the Mathematics learning time, using a cutoff eyeballed from the plot
high_outliers_math = (pisa['Learning Time - Mathematics'] > 600)

print(high_outliers_math.sum())
print(pisa.loc[high_outliers_math,:])
70
                        Country  Student ID  Gender  \
215                     Denmark        1626    Male   
486               Florida (USA)        1561    Male   
526                      Mexico       15126  Female   
1259                     Canada       14395  Female   
1813        Massachusetts (USA)        1435  Female   
1928                     Latvia        2110  Female   
1997                     Mexico       21587  Female   
2067                     Canada       17113    Male   
2622                     Mexico         554    Male   
3112                     Mexico       30815    Male   
3240       United Arab Emirates        7285    Male   
3714                     Canada       19278    Male   
3725       United Arab Emirates        6070  Female   
3757       United Arab Emirates        7660  Female   
4024                     Mexico       33486    Male   
4271                     Canada       17089    Male   
4294                     Canada         259    Male   
4557                   Colombia        8167  Female   
4814                     Mexico        2391  Female   
4844                      Chile         732  Female   
4903                      Chile        6723    Male   
4981                     Brazil       11323  Female   
5094                     Mexico       33319    Male   
5106                     Mexico       29403  Female   
5139                     Mexico       32617  Female   
5549                     Canada       21295    Male   
5582                     Mexico       22770  Female   
5854                     Mexico       14924    Male   
5928                     Mexico       19752    Male   
5933                     Mexico       19889    Male   
...                         ...         ...     ...   
8971                    Iceland         654    Male   
9054       United Arab Emirates       10635    Male   
9145                     Canada        7385    Male   
9266       United Arab Emirates        6839    Male   
9530                      Spain       15657  Female   
9780                     Canada        7769  Female   
9975       United Arab Emirates        1063  Female   
10746                    Canada        7097  Female   
10858  United States of America        3314    Male   
10903      United Arab Emirates        8596    Male   
11214                    Mexico       28612    Male   
11462                    Mexico       22303    Male   
11745            United Kingdom       11879  Female   
11777                  Portugal        4294    Male   
12161                    Mexico       31426    Male   
12522             Florida (USA)         710    Male   
12713                  Colombia        4610  Female   
12837                    Mexico       33633  Female   
13041                    Mexico       29600  Female   
13109                    Mexico       31542    Male   
13140                    Mexico       14715  Female   
13460                     Spain       12575    Male   
13504                      Peru        2819    Male   
14424                    Mexico       29944    Male   
14529                    Canada       18264    Male   
15010                    Mexico       32663    Male   
15023                    Mexico       32425  Female   
15048                    Mexico       33720    Male   
15069                     Spain       21983    Male   
15119                    Mexico       31297  Female   

       Out-of-School Study Time - Homework  \
215                                    1.0   
486                                    2.0   
526                                    1.0   
1259                                   1.0   
1813                                  27.0   
1928                                   2.0   
1997                                   2.0   
2067                                   3.0   
2622                                   3.0   
3112                                   3.0   
3240                                   3.0   
3714                                   3.0   
3725                                  17.0   
3757                                   5.0   
4024                                   7.0   
4271                                   7.0   
4294                                   2.0   
4557                                   4.0   
4814                                  10.0   
4844                                   4.0   
4903                                   4.0   
4981                                   2.0   
5094                                   8.0   
5106                                   7.0   
5139                                   7.0   
5549                                   2.0   
5582                                  10.0   
5854                                   2.0   
5928                                   4.0   
5933                                   1.0   
...                                    ...   
8971                                   3.0   
9054                                   4.0   
9145                                  14.0   
9266                                   3.0   
9530                                  12.0   
9780                                   7.0   
9975                                  19.0   
10746                                  1.0   
10858                                 14.0   
10903                                 10.0   
11214                                  4.0   
11462                                  2.0   
11745                                  6.0   
11777                                  5.0   
12161                                 15.0   
12522                                 15.0   
12713                                  6.0   
12837                                 10.0   
13041                                  5.0   
13109                                  4.0   
13140                                  5.0   
13460                                  2.0   
13504                                  3.0   
14424                                  2.0   
14529                                  1.0   
15010                                  2.0   
15023                                  5.0   
15048                                 10.0   
15069                                  2.0   
15119                                  3.0   

       Out-of-School Study Time - Guided Homework  \
215                                           1.0   
486                                           2.0   
526                                           1.0   
1259                                          1.0   
1813                                          0.0   
1928                                          2.0   
1997                                          2.0   
2067                                          1.0   
2622                                          2.0   
3112                                          2.0   
3240                                          2.0   
3714                                          0.0   
3725                                          6.0   
3757                                          0.0   
4024                                          2.0   
4271                                          0.0   
4294                                          0.0   
4557                                          2.0   
4814                                          4.0   
4844                                          2.0   
4903                                          1.0   
4981                                          1.0   
5094                                          8.0   
5106                                          1.0   
5139                                          2.0   
5549                                          2.0   
5582                                         10.0   
5854                                          2.0   
5928                                          1.0   
5933                                          1.0   
...                                           ...   
8971                                          1.0   
9054                                          2.0   
9145                                          4.0   
9266                                          0.0   
9530                                          0.0   
9780                                          2.0   
9975                                          7.0   
10746                                         0.0   
10858                                         2.0   
10903                                         2.0   
11214                                         2.0   
11462                                         1.0   
11745                                         1.0   
11777                                         5.0   
12161                                         1.0   
12522                                         2.0   
12713                                         2.0   
12837                                         0.0   
13041                                         1.0   
13109                                         1.0   
13140                                         3.0   
13460                                         1.0   
13504                                         3.0   
14424                                         1.0   
14529                                         0.0   
15010                                         1.0   
15023                                         4.0   
15048                                         5.0   
15069                                         0.0   
15119                                         2.0   

       Out-of-School Study Time - Personal Tutor  \
215                                          1.0   
486                                          0.0   
526                                          0.0   
1259                                         0.0   
1813                                         0.0   
1928                                         1.0   
1997                                         2.0   
2067                                         0.0   
2622                                         0.0   
3112                                         0.0   
3240                                         1.0   
3714                                         0.0   
3725                                         0.0   
3757                                         0.0   
4024                                         0.0   
4271                                         0.0   
4294                                         0.0   
4557                                         3.0   
4814                                         2.0   
4844                                         1.0   
4903                                         0.0   
4981                                         0.0   
5094                                         0.0   
5106                                         0.0   
5139                                         0.0   
5549                                         0.0   
5582                                         7.0   
5854                                        10.0   
5928                                         0.0   
5933                                         0.0   
...                                          ...   
8971                                         0.0   
9054                                         6.0   
9145                                         1.0   
9266                                         0.0   
9530                                         0.0   
9780                                         4.0   
9975                                         4.0   
10746                                        0.0   
10858                                        0.0   
10903                                        0.0   
11214                                        0.0   
11462                                        0.0   
11745                                        0.0   
11777                                       12.0   
12161                                        0.0   
12522                                        0.0   
12713                                        1.0   
12837                                        2.0   
13041                                        0.0   
13109                                        0.0   
13140                                        1.0   
13460                                        0.0   
13504                                        0.0   
14424                                        0.0   
14529                                        0.0   
15010                                        5.0   
15023                                        2.0   
15048                                        0.0   
15069                                        0.0   
15119                                        3.0   

       Out-of-School Study Time - Commercial Company  \
215                                              1.0   
486                                              0.0   
526                                              0.0   
1259                                             0.0   
1813                                             0.0   
1928                                             2.0   
1997                                             1.0   
2067                                             0.0   
2622                                             0.0   
3112                                             0.0   
3240                                             0.0   
3714                                             0.0   
3725                                             6.0   
3757                                             0.0   
4024                                             0.0   
4271                                             0.0   
4294                                             0.0   
4557                                             1.0   
4814                                             0.0   
4844                                             1.0   
4903                                             0.0   
4981                                             0.0   
5094                                             0.0   
5106                                             0.0   
5139                                             0.0   
5549                                             0.0   
5582                                             0.0   
5854                                             0.0   
5928                                             0.0   
5933                                             0.0   
...                                              ...   
8971                                             0.0   
9054                                             0.0   
9145                                             0.0   
9266                                             0.0   
9530                                             0.0   
9780                                             6.0   
9975                                             0.0   
10746                                            0.0   
10858                                            0.0   
10903                                            0.0   
11214                                            0.0   
11462                                            0.0   
11745                                            0.0   
11777                                            0.0   
12161                                            0.0   
12522                                            0.0   
12713                                            0.0   
12837                                            0.0   
13041                                            0.0   
13109                                            0.0   
13140                                            0.0   
13460                                            0.0   
13504                                            0.0   
14424                                            0.0   
14529                                            0.0   
15010                                            2.0   
15023                                            0.0   
15048                                            0.0   
15069                                            0.0   
15119                                            0.0   

       Out-of-School Study Time - With Parent  Learning Time - Mathematics  \
215                                       1.0                        960.0   
486                                       2.0                        720.0   
526                                       0.0                        630.0   
1259                                      1.0                        750.0   
1813                                      1.0                        830.0   
1928                                      1.0                        640.0   
1997                                      2.0                       1000.0   
2067                                      3.0                        650.0   
2622                                      0.0                        960.0   
3112                                      2.0                        900.0   
3240                                      0.0                        630.0   
3714                                      0.0                        630.0   
3725                                      4.0                        630.0   
3757                                      1.0                        900.0   
4024                                      1.0                        720.0   
4271                                      0.0                        720.0   
4294                                      1.0                        720.0   
4557                                      2.0                        840.0   
4814                                      3.0                        720.0   
4844                                      0.0                        960.0   
4903                                      2.0                        840.0   
4981                                      0.0                        800.0   
5094                                      1.0                        720.0   
5106                                      2.0                        750.0   
5139                                      2.0                        720.0   
5549                                      1.0                        602.0   
5582                                      0.0                        840.0   
5854                                      0.0                        720.0   
5928                                      0.0                        700.0   
5933                                      0.0                        980.0   
...                                       ...                          ...   
8971                                      0.0                        640.0   
9054                                      3.0                        640.0   
9145                                      1.0                        960.0   
9266                                      0.0                        810.0   
9530                                      0.0                        800.0   
9780                                      3.0                        675.0   
9975                                     14.0                        720.0   
10746                                     0.0                       1200.0   
10858                                     1.0                        720.0   
10903                                     0.0                        720.0   
11214                                     2.0                        720.0   
11462                                     0.0                        900.0   
11745                                     0.0                        875.0   
11777                                     3.0                        630.0   
12161                                     1.0                        720.0   
12522                                     4.0                        700.0   
12713                                     0.0                        990.0   
12837                                     0.0                        800.0   
13041                                     1.0                        720.0   
13109                                     1.0                       1440.0   
13140                                     2.0                        720.0   
13460                                     0.0                        720.0   
13504                                     1.0                        720.0   
14424                                     1.0                        720.0   
14529                                     0.0                        650.0   
15010                                     1.0                        630.0   
15023                                     2.0                        720.0   
15048                                     0.0                       1200.0   
15069                                     0.0                        720.0   
15119                                     1.0                        750.0   

       Learning Time - Test Language  Learning Time - Science  \
215                            960.0                    240.0   
486                            720.0                    720.0   
526                            420.0                    420.0   
1259                           750.0                    750.0   
1813                             0.0                    830.0   
1928                           240.0                    640.0   
1997                          1000.0                   1000.0   
2067                           300.0                    300.0   
2622                           960.0                    231.0   
3112                           900.0                    270.0   
3240                           720.0                    495.0   
3714                           630.0                    630.0   
3725                           585.0                    900.0   
3757                           315.0                    810.0   
4024                           720.0                    480.0   
4271                           720.0                    810.0   
4294                           720.0                    720.0   
4557                           300.0                    480.0   
4814                           480.0                   1800.0   
4844                           960.0                    240.0   
4903                           840.0                    480.0   
4981                           600.0                    200.0   
5094                           720.0                    720.0   
5106                           300.0                    400.0   
5139                           400.0                    420.0   
5549                           602.0                    602.0   
5582                           840.0                    180.0   
5854                           480.0                    480.0   
5928                           250.0                    200.0   
5933                           250.0                    250.0   
...                              ...                      ...   
8971                           640.0                    640.0   
9054                           160.0                   1200.0   
9145                           240.0                    960.0   
9266                           360.0                   1260.0   
9530                           250.0                    800.0   
9780                           300.0                    375.0   
9975                           225.0                    900.0   
10746                          300.0                    300.0   
10858                          270.0                      0.0   
10903                          160.0                   1800.0   
11214                          480.0                    120.0   
11462                          720.0                    480.0   
11745                          875.0                    875.0   
11777                          360.0                    180.0   
12161                          600.0                    600.0   
12522                          490.0                    350.0   
12713                          440.0                    440.0   
12837                          600.0                    800.0   
13041                          600.0                    420.0   
13109                          600.0                    720.0   
13140                          600.0                    480.0   
13460                          240.0                    180.0   
13504                          240.0                    480.0   
14424                          480.0                    480.0   
14529                          800.0                    750.0   
15010                          270.0                    270.0   
15023                          720.0                    480.0   
15048                          300.0                    600.0   
15069                          720.0                    840.0   
15119                          300.0                      0.0   

       Average Math Score  Average Reading Score  Average Science Score  \
215             446.57268              340.24700              296.28870   
486             377.01356              366.79140              421.24208   
526             491.36158              549.50768              505.16600   
1259            486.29850              465.54912              497.33308   
1813            468.38292              518.60900              520.83178   
1928            605.24222              635.29316              589.27638   
1997            365.95264              405.26098              366.97128   
2067            637.87972              615.71518              620.51472   
2622            497.98256              524.29340              530.62290   
3112            360.18850              369.67840              358.01940   
3240            407.78160              409.61528              447.81796   
3714            564.65962              618.44178              650.54084   
3725            402.56270              381.11392              381.70458   
3757            595.34974              626.07918              633.10332   
4024            442.75588              401.27506              399.23536   
4271            661.01420              650.59978              639.35100   
4294            537.16310              519.08082              493.41664   
4557            452.41472              443.38786              486.14322   
4814            365.17370              403.03690              413.40916   
4844            375.53356              450.06006              430.66020   
4903            439.40644              451.63720              517.00858   
4981            586.00250              582.47154              560.92874   
5094            445.71584              392.13288              387.67250   
5106            484.89640              494.54142              435.88212   
5139            467.52610              510.42760              498.73184   
5549            492.76368              493.89972              523.16302   
5582            412.68890              424.95988              415.46064   
5854            379.50616              362.94208              346.45654   
5928            430.99390              401.91660              421.61506   
5933            457.94514              471.68582              430.93994   
...                   ...                    ...                    ...   
8971            488.86900              502.72110              448.37746   
9054            453.81680              491.25330              496.02758   
9145            630.16826              649.71764              599.90676   
9266            440.88642              412.18150              413.40916   
9530            404.82162              538.14904              503.67402   
9780            493.15314              504.15260              515.79632   
9975            454.20626              499.62502              538.54904   
10746           379.89564              441.08434              428.14248   
10858           674.80138              647.71278              667.04588   
10903           612.87584              503.12206              573.14434   
11214           446.80636              470.72352              442.31630   
11462           440.88644              472.64816              495.28160   
11745           574.39634              588.11116              576.59456   
11777           530.54214              477.21924              490.24618   
12161           428.11184              429.34312              428.98172   
12522           574.70790              623.49404              598.88100   
12713           416.34990              456.89112              466.65424   
12837           520.41596              566.90310              514.49086   
13041           428.11186              488.18694              457.98210   
13109           427.95608              402.39778              431.12644   
13140           387.60712              415.90474              432.05894   
13460           387.60710              358.45118              415.27412   
13504           327.47314              234.47046              307.10554   
14424           414.94782              415.95066              391.68222   
14529           355.51488              348.10612              328.55278   
15010           352.86650              280.02096              364.54682   
15023           370.93784              383.81460              341.23462   
15048           401.08270              418.67726              411.91718   
15069           376.23462              383.87286              485.95676   
15119           397.73328              444.34102              396.15814   

       Average Total Score        Education - Father  \
215             361.036127  Bachelor’s or equivalent   
486             388.349013  Bachelor’s or equivalent   
526             515.345087  Bachelor’s or equivalent   
1259            483.060233  Bachelor’s or equivalent   
1813            502.607900  Bachelor’s or equivalent   
1928            609.937253  Bachelor’s or equivalent   
1997            379.394967  Bachelor’s or equivalent   
2067            624.703207  Bachelor’s or equivalent   
2622            517.632953  Bachelor’s or equivalent   
3112            362.628767  Bachelor’s or equivalent   
3240            421.738280  Bachelor’s or equivalent   
3714            611.214080  Bachelor’s or equivalent   
3725            388.460400  Bachelor’s or equivalent   
3757            618.177413      Short-cycle tertiary   
4024            414.422100           Lower secondary   
4271            650.321660      Short-cycle tertiary   
4294            516.553520  Bachelor’s or equivalent   
4557            460.648600  Bachelor’s or equivalent   
4814            393.873253  Bachelor’s or equivalent   
4844            418.751273  Bachelor’s or equivalent   
4903            469.350740  Bachelor’s or equivalent   
4981            576.467593  Bachelor’s or equivalent   
5094            408.507073      Short-cycle tertiary   
5106            471.773313      Short-cycle tertiary   
5139            492.228513      Short-cycle tertiary   
5549            503.275473      Short-cycle tertiary   
5582            417.703140      Short-cycle tertiary   
5854            362.968260      Short-cycle tertiary   
5928            418.175187      Short-cycle tertiary   
5933            453.523633      Short-cycle tertiary   
...                    ...                       ...   
8971            479.989187  Bachelor’s or equivalent   
9054            480.365893  Bachelor’s or equivalent   
9145            626.597553  Bachelor’s or equivalent   
9266            422.159027      Short-cycle tertiary   
9530            482.214893      Short-cycle tertiary   
9780            504.367353      Short-cycle tertiary   
9975            497.460107  Bachelor’s or equivalent   
10746           416.374153      Short-cycle tertiary   
10858           663.186680  Bachelor’s or equivalent   
10903           563.047413      Short-cycle tertiary   
11214           453.282060           Upper secondary   
11462           469.605400      Short-cycle tertiary   
11745           579.700687      Short-cycle tertiary   
11777           499.335853  Bachelor’s or equivalent   
12161           428.812227           Lower secondary   
12522           599.027647  Bachelor’s or equivalent   
12713           446.631753  Bachelor’s or equivalent   
12837           533.936640  Bachelor’s or equivalent   
13041           458.093633      Short-cycle tertiary   
13109           420.493433      Short-cycle tertiary   
13140           411.856933      Short-cycle tertiary   
13460           387.110800      Short-cycle tertiary   
13504           289.683047  Bachelor’s or equivalent   
14424           407.526900           Upper secondary   
14529           344.057927            Post-secondary   
15010           332.478093           Early childhood   
15023           365.329020           Early childhood   
15048           410.559047           Early childhood   
15069           415.354747      Short-cycle tertiary   
15119           412.744147           Upper secondary   

             Education - Mother  Out-of-School Study Time - Total  \
215    Bachelor’s or equivalent                               5.0   
486    Bachelor’s or equivalent                               6.0   
526    Bachelor’s or equivalent                               2.0   
1259   Bachelor’s or equivalent                               3.0   
1813   Bachelor’s or equivalent                              28.0   
1928   Bachelor’s or equivalent                               8.0   
1997   Bachelor’s or equivalent                               9.0   
2067   Bachelor’s or equivalent                               7.0   
2622       Short-cycle tertiary                               5.0   
3112       Short-cycle tertiary                               7.0   
3240       Short-cycle tertiary                               6.0   
3714       Short-cycle tertiary                               3.0   
3725       Short-cycle tertiary                              33.0   
3757       Short-cycle tertiary                               6.0   
4024       Short-cycle tertiary                              10.0   
4271       Short-cycle tertiary                               7.0   
4294       Short-cycle tertiary                               3.0   
4557       Short-cycle tertiary                              12.0   
4814       Short-cycle tertiary                              19.0   
4844       Short-cycle tertiary                               8.0   
4903       Short-cycle tertiary                               7.0   
4981       Short-cycle tertiary                               3.0   
5094       Short-cycle tertiary                              17.0   
5106       Short-cycle tertiary                              10.0   
5139       Short-cycle tertiary                              11.0   
5549       Short-cycle tertiary                               5.0   
5582       Short-cycle tertiary                              27.0   
5854       Short-cycle tertiary                              14.0   
5928       Short-cycle tertiary                               5.0   
5933       Short-cycle tertiary                               2.0   
...                         ...                               ...   
8971       Short-cycle tertiary                               4.0   
9054       Short-cycle tertiary                              15.0   
9145       Short-cycle tertiary                              20.0   
9266       Short-cycle tertiary                               3.0   
9530             Post-secondary                              12.0   
9780             Post-secondary                              22.0   
9975             Post-secondary                              44.0   
10746           Upper secondary                               1.0   
10858           Upper secondary                              17.0   
10903           Upper secondary                              12.0   
11214           Upper secondary                               8.0   
11462           Upper secondary                               3.0   
11745           Upper secondary                               7.0   
11777           Upper secondary                              25.0   
12161           Upper secondary                              17.0   
12522           Upper secondary                              21.0   
12713           Upper secondary                               9.0   
12837           Upper secondary                              12.0   
13041           Lower secondary                               7.0   
13109           Lower secondary                               6.0   
13140           Lower secondary                              11.0   
13460           Lower secondary                               3.0   
13504           Lower secondary                               7.0   
14424           Lower secondary                               4.0   
14529                   Primary                               1.0   
15010           Early childhood                              11.0   
15023           Early childhood                              13.0   
15048           Early childhood                              15.0   
15069           Early childhood                               2.0   
15119           Early childhood                               9.0   

       Learning Time - Total  log_study  
215                   2160.0   0.698970  
486                   2160.0   0.778151  
526                   1470.0   0.301030  
1259                  2250.0   0.477121  
1813                  1660.0   1.447158  
1928                  1520.0   0.903090  
1997                  3000.0   0.954243  
2067                  1250.0   0.845098  
2622                  2151.0   0.698970  
3112                  2070.0   0.845098  
3240                  1845.0   0.778151  
3714                  1890.0   0.477121  
3725                  2115.0   1.518514  
3757                  2025.0   0.778151  
4024                  1920.0   1.000000  
4271                  2250.0   0.845098  
4294                  2160.0   0.477121  
4557                  1620.0   1.079181  
4814                  3000.0   1.278754  
4844                  2160.0   0.903090  
4903                  2160.0   0.845098  
4981                  1600.0   0.477121  
5094                  2160.0   1.230449  
5106                  1450.0   1.000000  
5139                  1540.0   1.041393  
5549                  1806.0   0.698970  
5582                  1860.0   1.431364  
5854                  1680.0   1.146128  
5928                  1150.0   0.698970  
5933                  1480.0   0.301030  
...                      ...        ...  
8971                  1920.0   0.602060  
9054                  2000.0   1.176091  
9145                  2160.0   1.301030  
9266                  2430.0   0.477121  
9530                  1850.0   1.079181  
9780                  1350.0   1.342423  
9975                  1845.0   1.643453  
10746                 1800.0   0.000000  
10858                  990.0   1.230449  
10903                 2680.0   1.079181  
11214                 1320.0   0.903090  
11462                 2100.0   0.477121  
11745                 2625.0   0.845098  
11777                 1170.0   1.397940  
12161                 1920.0   1.230449  
12522                 1540.0   1.322219  
12713                 1870.0   0.954243  
12837                 2200.0   1.079181  
13041                 1740.0   0.845098  
13109                 2760.0   0.778151  
13140                 1800.0   1.041393  
13460                 1140.0   0.477121  
13504                 1440.0   0.845098  
14424                 1680.0   0.602060  
14529                 2200.0   0.000000  
15010                 1170.0   1.041393  
15023                 1920.0   1.113943  
15048                 2100.0   1.176091  
15069                 2280.0   0.301030  
15119                 1050.0   0.954243  

[70 rows x 20 columns]
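The wide printout above wraps across many column blocks, which makes it hard to compare how many students exceed a cutoff in each subject. Below is a minimal sketch of a more compact check. The helper name `count_above` and the small `demo` frame are hypothetical and exist only for illustration (the values are made up, not taken from the PISA data); the cutoff of 600 mirrors the one used for the test-language filter in this notebook.

```python
import pandas as pd


def count_above(df, columns, threshold):
    """Count, per column, the rows whose value is strictly above threshold."""
    # .gt() builds a boolean frame; summing it counts the True values per column.
    # Missing values compare as False, so they are not counted.
    return df[columns].gt(threshold).sum()


# Tiny illustrative frame (made-up values, NOT real PISA rows)
demo = pd.DataFrame({
    'Learning Time - Mathematics': [960.0, 300.0, 630.0],
    'Learning Time - Test Language': [960.0, 250.0, 420.0],
    'Learning Time - Science': [240.0, 200.0, 420.0],
})

counts = count_above(demo, list(demo.columns), 600)
print(counts)
```

On the real data, the same call with `pisa` and the three learning-time column names would give one count per subject without printing every flagged row.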
In [40]:
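# Flag students whose test-language learning time exceeds 600
high_outliers_lang = pisa['Learning Time - Test Language'] > 600

# Count the flagged rows, then display them
print(high_outliers_lang.sum())
print(pisa.loc[high_outliers_lang, :])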
62
                    Country  Student ID  Gender  \
215                 Denmark        1626    Male   
486           Florida (USA)        1561    Male   
1044   United Arab Emirates        7004  Female   
1259                 Canada       14395  Female   
1429                Denmark        7234  Female   
1732         United Kingdom       10803  Female   
1855    Massachusetts (USA)         587  Female   
1917                  Chile         167    Male   
1997                 Mexico       21587  Female   
2237   United Arab Emirates       10226    Male   
2622                 Mexico         554    Male   
3032                Denmark         784    Male   
3112                 Mexico       30815    Male   
3240   United Arab Emirates        7285    Male   
3642                 Mexico       32062    Male   
3714                 Canada       19278    Male   
4024                 Mexico       33486    Male   
4116                 Canada       12780    Male   
4271                 Canada       17089    Male   
4294                 Canada         259    Male   
4844                  Chile         732  Female   
4903                  Chile        6723    Male   
4995            Netherlands        1606  Female   
5094                 Mexico       33319    Male   
5549                 Canada       21295    Male   
5582                 Mexico       22770  Female   
5603                 Mexico       23081    Male   
6015                 Mexico       32026    Male   
6079                 Mexico       27598  Female   
6127                 Canada       20630  Female   
...                     ...         ...     ...   
8220                 Mexico       19139    Male   
8494                 Mexico       32707    Male   
8771                 Mexico       12761  Female   
8895                 Mexico       32389    Male   
8971                Iceland         654    Male   
8984                 Canada        8745    Male   
9019            Switzerland        5467  Female   
9111                 Canada        7310    Male   
9771                 Canada       14354  Female   
10642                Canada       10176    Male   
10716                Mexico       33767  Female   
11462                Mexico       22303    Male   
11745        United Kingdom       11879  Female   
12261                Mexico       32782    Male   
12481                Mexico       16949  Female   
12526         Florida (USA)         934    Male   
12869  United Arab Emirates        6626  Female   
12886                Mexico       19905    Male   
12904                Mexico       27024    Male   
13062                Mexico       25525    Male   
13222                Mexico       32448    Male   
13316                Canada       16655    Male   
13699                Mexico       32646    Male   
13855                Mexico       33452  Female   
14497                Mexico       27922    Male   
14529                Canada       18264    Male   
14747                Mexico       33391    Male   
15023                Mexico       32425  Female   
15069                 Spain       21983    Male   
15077                Canada       11124  Female   

       Out-of-School Study Time - Homework  \
215                                    1.0   
486                                    2.0   
1044                                   4.0   
1259                                   1.0   
1429                                   2.0   
1732                                  12.0   
1855                                   7.0   
1917                                   2.0   
1997                                   2.0   
2237                                   5.0   
2622                                   3.0   
3032                                   4.0   
3112                                   3.0   
3240                                   3.0   
3642                                  11.0   
3714                                   3.0   
4024                                   7.0   
4116                                   9.0   
4271                                   7.0   
4294                                   2.0   
4844                                   4.0   
4903                                   4.0   
4995                                   7.0   
5094                                   8.0   
5549                                   2.0   
5582                                  10.0   
5603                                  10.0   
6015                                   2.0   
6079                                   4.0   
6127                                   2.0   
...                                    ...   
8220                                   4.0   
8494                                   5.0   
8771                                   3.0   
8895                                   4.0   
8971                                   3.0   
8984                                   5.0   
9019                                   3.0   
9111                                   9.0   
9771                                   2.0   
10642                                  6.0   
10716                                  4.0   
11462                                  2.0   
11745                                  6.0   
12261                                  3.0   
12481                                  8.0   
12526                                  5.0   
12869                                  3.0   
12886                                  2.0   
12904                                 12.0   
13062                                  5.0   
13222                                  7.0   
13316                                  4.0   
13699                                  7.0   
13855                                 15.0   
14497                                  4.0   
14529                                  1.0   
14747                                  0.0   
15023                                  5.0   
15069                                  2.0   
15077                                  2.0   

       Out-of-School Study Time - Guided Homework  \
215                                           1.0   
486                                           2.0   
1044                                          1.0   
1259                                          1.0   
1429                                          0.0   
1732                                         10.0   
1855                                          1.0   
1917                                          0.0   
1997                                          2.0   
2237                                          1.0   
2622                                          2.0   
3032                                          0.0   
3112                                          2.0   
3240                                          2.0   
3642                                          5.0   
3714                                          0.0   
4024                                          2.0   
4116                                          1.0   
4271                                          0.0   
4294                                          0.0   
4844                                          2.0   
4903                                          1.0   
4995                                          1.0   
5094                                          8.0   
5549                                          2.0   
5582                                         10.0   
5603                                         10.0   
6015                                          0.0   
6079                                          4.0   
6127                                          2.0   
...                                           ...   
8220                                          1.0   
8494                                          3.0   
8771                                          2.0   
8895                                          4.0   
8971                                          1.0   
8984                                          2.0   
9019                                          0.0   
9111                                          2.0   
9771                                          1.0   
10642                                         0.0   
10716                                         2.0   
11462                                         1.0   
11745                                         1.0   
12261                                         2.0   
12481                                         2.0   
12526                                         3.0   
12869                                         2.0   
12886                                         2.0   
12904                                         5.0   
13062                                         1.0   
13222                                         7.0   
13316                                         0.0   
13699                                         0.0   
13855                                         0.0   
14497                                         0.0   
14529                                         0.0   
14747                                         0.0   
15023                                         4.0   
15069                                         0.0   
15077                                         1.0   

       Out-of-School Study Time - Personal Tutor  \
215                                          1.0   
486                                          0.0   
1044                                         0.0   
1259                                         0.0   
1429                                         0.0   
1732                                        14.0   
1855                                         0.0   
1917                                         0.0   
1997                                         2.0   
2237                                         0.0   
2622                                         0.0   
3032                                         0.0   
3112                                         0.0   
3240                                         1.0   
3642                                         0.0   
3714                                         0.0   
4024                                         0.0   
4116                                         0.0   
4271                                         0.0   
4294                                         0.0   
4844                                         1.0   
4903                                         0.0   
4995                                         2.0   
5094                                         0.0   
5549                                         0.0   
5582                                         7.0   
5603                                         0.0   
6015                                         0.0   
6079                                         5.0   
6127                                         1.0   
...                                          ...   
8220                                         0.0   
8494                                         0.0   
8771                                         0.0   
8895                                         0.0   
8971                                         0.0   
8984                                         0.0   
9019                                         0.0   
9111                                         0.0   
9771                                         0.0   
10642                                        0.0   
10716                                        2.0   
11462                                        0.0   
11745                                        0.0   
12261                                        0.0   
12481                                        0.0   
12526                                        0.0   
12869                                        0.0   
12886                                        0.0   
12904                                        0.0   
13062                                        0.0   
13222                                        4.0   
13316                                        0.0   
13699                                        0.0   
13855                                        0.0   
14497                                        6.0   
14529                                        0.0   
14747                                        0.0   
15023                                        2.0   
15069                                        0.0   
15077                                        1.0   

       Out-of-School Study Time - Commercial Company  \
215                                              1.0   
486                                              0.0   
1044                                             0.0   
1259                                             0.0   
1429                                             0.0   
1732                                             2.0   
1855                                             0.0   
1917                                             0.0   
1997                                             1.0   
2237                                             0.0   
2622                                             0.0   
3032                                             0.0   
3112                                             0.0   
3240                                             0.0   
3642                                             0.0   
3714                                             0.0   
4024                                             0.0   
4116                                             0.0   
4271                                             0.0   
4294                                             0.0   
4844                                             1.0   
4903                                             0.0   
4995                                             2.0   
5094                                             0.0   
5549                                             0.0   
5582                                             0.0   
5603                                             0.0   
6015                                             0.0   
6079                                             0.0   
6127                                             0.0   
...                                              ...   
8220                                             0.0   
8494                                             0.0   
8771                                             0.0   
8895                                             0.0   
8971                                             0.0   
8984                                             1.0   
9019                                             0.0   
9111                                             0.0   
9771                                             0.0   
10642                                            0.0   
10716                                            0.0   
11462                                            0.0   
11745                                            0.0   
12261                                            1.0   
12481                                            0.0   
12526                                            0.0   
12869                                            0.0   
12886                                            0.0   
12904                                            0.0   
13062                                            0.0   
13222                                            4.0   
13316                                            0.0   
13699                                            0.0   
13855                                            0.0   
14497                                            0.0   
14529                                            0.0   
14747                                            0.0   
15023                                            0.0   
15069                                            0.0   
15077                                            0.0   

       Out-of-School Study Time - With Parent  Learning Time - Mathematics  \
215                                       1.0                        960.0   
486                                       2.0                        720.0   
1044                                      0.0                        540.0   
1259                                      1.0                        750.0   
1429                                      0.0                        180.0   
1732                                     10.0                        360.0   
1855                                      0.0                        240.0   
1917                                      1.0                        540.0   
1997                                      2.0                       1000.0   
2237                                      1.0                        540.0   
2622                                      0.0                        960.0   
3032                                      1.0                        360.0   
3112                                      2.0                        900.0   
3240                                      0.0                        630.0   
3642                                      2.0                        200.0   
3714                                      0.0                        630.0   
4024                                      1.0                        720.0   
4116                                      1.0                        275.0   
4271                                      0.0                        720.0   
4294                                      1.0                        720.0   
4844                                      0.0                        960.0   
4903                                      2.0                        840.0   
4995                                      1.0                        480.0   
5094                                      1.0                        720.0   
5549                                      1.0                        602.0   
5582                                      0.0                        840.0   
5603                                      0.0                        440.0   
6015                                      0.0                        450.0   
6079                                      0.0                        600.0   
6127                                      1.0                        462.0   
...                                       ...                          ...   
8220                                      0.0                        180.0   
8494                                      3.0                        180.0   
8771                                      2.0                       1080.0   
8895                                      3.0                        200.0   
8971                                      0.0                        640.0   
8984                                      1.0                        450.0   
9019                                      0.0                        225.0   
9111                                      0.0                        500.0   
9771                                      0.0                        576.0   
10642                                     0.0                          0.0   
10716                                     2.0                        540.0   
11462                                     0.0                        900.0   
11745                                     0.0                        875.0   
12261                                     0.0                        600.0   
12481                                     2.0                        600.0   
12526                                     4.0                        300.0   
12869                                     3.0                        480.0   
12886                                     3.0                        150.0   
12904                                     0.0                        300.0   
13062                                     0.0                        350.0   
13222                                     7.0                        200.0   
13316                                     0.0                        375.0   
13699                                     1.0                        360.0   
13855                                     3.0                        480.0   
14497                                     0.0                        220.0   
14529                                     0.0                        650.0   
14747                                     0.0                        240.0   
15023                                     2.0                        720.0   
15069                                     0.0                        720.0   
15077                                     2.0                        462.0   

       Learning Time - Test Language  Learning Time - Science  \
215                            960.0                    240.0   
486                            720.0                    720.0   
1044                           720.0                    540.0   
1259                           750.0                    750.0   
1429                           720.0                    720.0   
1732                           720.0                    300.0   
1855                          1200.0                    240.0   
1917                           720.0                    180.0   
1997                          1000.0                   1000.0   
2237                           810.0                   1080.0   
2622                           960.0                    231.0   
3032                           720.0                    135.0   
3112                           900.0                    270.0   
3240                           720.0                    495.0   
3642                           720.0                    400.0   
3714                           630.0                    630.0   
4024                           720.0                    480.0   
4116                           900.0                    275.0   
4271                           720.0                    810.0   
4294                           720.0                    720.0   
4844                           960.0                    240.0   
4903                           840.0                    480.0   
4995                           640.0                   1440.0   
5094                           720.0                    720.0   
5549                           602.0                    602.0   
5582                           840.0                    180.0   
5603                           660.0                    990.0   
6015                           630.0                    540.0   
6079                           900.0                    300.0   
6127                           616.0                    308.0   
...                              ...                      ...   
8220                           960.0                    240.0   
8494                          1305.0                    180.0   
8771                           840.0                    600.0   
8895                           720.0                    720.0   
8971                           640.0                    640.0   
8984                           675.0                    375.0   
9019                          1530.0                    180.0   
9111                          1000.0                      0.0   
9771                           680.0                    340.0   
10642                          800.0                    800.0   
10716                          720.0                    540.0   
11462                          720.0                    480.0   
11745                          875.0                    875.0   
12261                          840.0                    360.0   
12481                          700.0                    600.0   
12526                         1800.0                    300.0   
12869                          630.0                    225.0   
12886                          750.0                    150.0   
12904                          720.0                    240.0   
13062                         1750.0                    200.0   
13222                          850.0                    700.0   
13316                          750.0                      0.0   
13699                          765.0                    315.0   
13855                          720.0                    480.0   
14497                          660.0                    660.0   
14529                          800.0                    750.0   
14747                         1450.0                    550.0   
15023                          720.0                    480.0   
15069                          720.0                    840.0   
15077                          616.0                    308.0   

       Average Math Score  Average Reading Score  Average Science Score  \
215             446.57268              340.24700              296.28870   
486             377.01356              366.79140              421.24208   
1044            517.53388              514.39918              491.55164   
1259            486.29850              465.54912              497.33308   
1429            478.27542              548.23678              490.43268   
1732            348.58234              338.61834              334.33420   
1855            553.36500              539.97596              523.44276   
1917            442.13274              446.98594              422.54756   
1997            365.95264              405.26098              366.97128   
2237            665.76568              606.57300              707.23610   
2622            497.98256              524.29340              530.62290   
3032            658.21000              664.87442              683.55090   
3112            360.18850              369.67840              358.01940   
3240            407.78160              409.61528              447.81796   
3642            498.83936              445.06126              452.38716   
3714            564.65962              618.44178              650.54084   
4024            442.75588              401.27506              399.23536   
4116            524.70012              461.50112              516.35582   
4271            661.01420              650.59978              639.35100   
4294            537.16310              519.08082              493.41664   
4844            375.53356              450.06006              430.66020   
4903            439.40644              451.63720              517.00858   
4995            534.04738              534.89238              529.22420   
5094            445.71584              392.13288              387.67250   
5549            492.76368              493.89972              523.16302   
5582            412.68890              424.95988              415.46064   
5603            570.03428              569.36276              570.99964   
6015            405.83424              382.34916              352.05148   
6079            547.67876              525.91666              511.22714   
6127            363.92742              456.09682              425.99776   
...                   ...                    ...                    ...   
8220            540.51256              514.10874              529.41066   
8494            417.51830              407.85102              403.89778   
8771            363.92742              405.10210              387.85902   
8895            440.34118              485.80006              442.40954   
8971            488.86900              502.72110              448.37746   
8984            483.18272              380.02350              424.78552   
9019            560.92070              557.68910              573.05110   
9111            559.20704              592.45878              564.93848   
9771            493.07524              536.71928              501.15630   
10642           526.80326              513.62758              551.04438   
10716           392.51442              477.78148              424.97202   
11462           440.88644              472.64816              495.28160   
11745           574.39634              588.11116              576.59456   
12261           387.99658              453.32130              347.38906   
12481           548.92510              520.43594              533.42038   
12526           496.03522              414.58734              454.99812   
12869           432.31812              536.79874              551.51062   
12886           353.17806              390.04782              381.42482   
12904           455.91994              469.19982              462.08504   
13062           447.89686              471.84624              389.53748   
13222           364.55056              343.21422              310.83548   
13316           623.31360              655.17090              647.37038   
13699           438.78328              377.45728              455.09138   
13855           437.77070              523.93088              450.14920   
14497           398.04486              450.67488              445.20700   
14529           355.51488              348.10612              328.55278   
14747           378.10408              373.84854              401.28682   
15023           370.93784              383.81460              341.23462   
15069           376.23462              383.87286              485.95676   
15077           479.83332              465.70796              444.92728   

       Average Total Score        Education - Father  \
215             361.036127  Bachelor’s or equivalent   
486             388.349013  Bachelor’s or equivalent   
1044            507.828233  Bachelor’s or equivalent   
1259            483.060233  Bachelor’s or equivalent   
1429            505.648293      Short-cycle tertiary   
1732            340.511627  Bachelor’s or equivalent   
1855            538.927907  Bachelor’s or equivalent   
1917            437.222080  Bachelor’s or equivalent   
1997            379.394967  Bachelor’s or equivalent   
2237            659.858260  Bachelor’s or equivalent   
2622            517.632953  Bachelor’s or equivalent   
3032            668.878440  Bachelor’s or equivalent   
3112            362.628767  Bachelor’s or equivalent   
3240            421.738280  Bachelor’s or equivalent   
3642            465.429260  Bachelor’s or equivalent   
3714            611.214080  Bachelor’s or equivalent   
4024            414.422100           Lower secondary   
4116            500.852353      Short-cycle tertiary   
4271            650.321660      Short-cycle tertiary   
4294            516.553520  Bachelor’s or equivalent   
4844            418.751273  Bachelor’s or equivalent   
4903            469.350740  Bachelor’s or equivalent   
4995            532.721320  Bachelor’s or equivalent   
5094            408.507073      Short-cycle tertiary   
5549            503.275473      Short-cycle tertiary   
5582            417.703140      Short-cycle tertiary   
5603            570.132227      Short-cycle tertiary   
6015            380.078293      Short-cycle tertiary   
6079            528.274187      Short-cycle tertiary   
6127            415.340667      Short-cycle tertiary   
...                    ...                       ...   
8220            528.010653      Short-cycle tertiary   
8494            409.755700      Short-cycle tertiary   
8771            385.629513      Short-cycle tertiary   
8895            456.183593      Short-cycle tertiary   
8971            479.989187  Bachelor’s or equivalent   
8984            429.330580  Bachelor’s or equivalent   
9019            563.886967  Bachelor’s or equivalent   
9111            572.201433  Bachelor’s or equivalent   
9771            510.316940      Short-cycle tertiary   
10642           530.491740      Short-cycle tertiary   
10716           431.755973      Short-cycle tertiary   
11462           469.605400      Short-cycle tertiary   
11745           579.700687      Short-cycle tertiary   
12261           396.235647           Lower secondary   
12481           534.260473      Short-cycle tertiary   
12526           455.206893  Bachelor’s or equivalent   
12869           506.875827  Bachelor’s or equivalent   
12886           374.883567      Short-cycle tertiary   
12904           462.401600      Short-cycle tertiary   
13062           436.426860      Short-cycle tertiary   
13222           339.533420      Short-cycle tertiary   
13316           641.951627  Bachelor’s or equivalent   
13699           423.777313           Lower secondary   
13855           470.616927                   Primary   
14497           431.308913      Short-cycle tertiary   
14529           344.057927            Post-secondary   
14747           384.413147                   Primary   
15023           365.329020           Early childhood   
15069           415.354747      Short-cycle tertiary   
15077           463.489520      Short-cycle tertiary   

             Education - Mother  Out-of-School Study Time - Total  \
215    Bachelor’s or equivalent                               5.0   
486    Bachelor’s or equivalent                               6.0   
1044   Bachelor’s or equivalent                               5.0   
1259   Bachelor’s or equivalent                               3.0   
1429   Bachelor’s or equivalent                               2.0   
1732   Bachelor’s or equivalent                              48.0   
1855   Bachelor’s or equivalent                               8.0   
1917   Bachelor’s or equivalent                               3.0   
1997   Bachelor’s or equivalent                               9.0   
2237   Bachelor’s or equivalent                               7.0   
2622       Short-cycle tertiary                               5.0   
3032       Short-cycle tertiary                               5.0   
3112       Short-cycle tertiary                               7.0   
3240       Short-cycle tertiary                               6.0   
3642       Short-cycle tertiary                              18.0   
3714       Short-cycle tertiary                               3.0   
4024       Short-cycle tertiary                              10.0   
4116       Short-cycle tertiary                              11.0   
4271       Short-cycle tertiary                               7.0   
4294       Short-cycle tertiary                               3.0   
4844       Short-cycle tertiary                               8.0   
4903       Short-cycle tertiary                               7.0   
4995       Short-cycle tertiary                              13.0   
5094       Short-cycle tertiary                              17.0   
5549       Short-cycle tertiary                               5.0   
5582       Short-cycle tertiary                              27.0   
5603       Short-cycle tertiary                              20.0   
6015       Short-cycle tertiary                               2.0   
6079       Short-cycle tertiary                              13.0   
6127       Short-cycle tertiary                               6.0   
...                         ...                               ...   
8220       Short-cycle tertiary                               5.0   
8494       Short-cycle tertiary                              11.0   
8771       Short-cycle tertiary                               7.0   
8895       Short-cycle tertiary                              11.0   
8971       Short-cycle tertiary                               4.0   
8984       Short-cycle tertiary                               9.0   
9019       Short-cycle tertiary                               3.0   
9111       Short-cycle tertiary                              11.0   
9771             Post-secondary                               3.0   
10642           Upper secondary                               6.0   
10716           Upper secondary                              10.0   
11462           Upper secondary                               3.0   
11745           Upper secondary                               7.0   
12261           Upper secondary                               6.0   
12481           Upper secondary                              12.0   
12526           Upper secondary                              12.0   
12869           Upper secondary                               8.0   
12886           Lower secondary                               7.0   
12904           Lower secondary                              17.0   
13062           Lower secondary                               6.0   
13222           Lower secondary                              29.0   
13316           Lower secondary                               4.0   
13699           Lower secondary                               8.0   
13855           Lower secondary                              18.0   
14497                   Primary                              10.0   
14529                   Primary                               1.0   
14747                   Primary                               0.0   
15023           Early childhood                              13.0   
15069           Early childhood                               2.0   
15077           Early childhood                               6.0   

       Learning Time - Total  log_study  
215                   2160.0   0.698970  
486                   2160.0   0.778151  
1044                  1800.0   0.698970  
1259                  2250.0   0.477121  
1429                  1620.0   0.301030  
1732                  1380.0   1.681241  
1855                  1680.0   0.903090  
1917                  1440.0   0.477121  
1997                  3000.0   0.954243  
2237                  2430.0   0.845098  
2622                  2151.0   0.698970  
3032                  1215.0   0.698970  
3112                  2070.0   0.845098  
3240                  1845.0   0.778151  
3642                  1320.0   1.255273  
3714                  1890.0   0.477121  
4024                  1920.0   1.000000  
4116                  1450.0   1.041393  
4271                  2250.0   0.845098  
4294                  2160.0   0.477121  
4844                  2160.0   0.903090  
4903                  2160.0   0.845098  
4995                  2560.0   1.113943  
5094                  2160.0   1.230449  
5549                  1806.0   0.698970  
5582                  1860.0   1.431364  
5603                  2090.0   1.301030  
6015                  1620.0   0.301030  
6079                  1800.0   1.113943  
6127                  1386.0   0.778151  
...                      ...        ...  
8220                  1380.0   0.698970  
8494                  1665.0   1.041393  
8771                  2520.0   0.845098  
8895                  1640.0   1.041393  
8971                  1920.0   0.602060  
8984                  1500.0   0.954243  
9019                  1935.0   0.477121  
9111                  1500.0   1.041393  
9771                  1596.0   0.477121  
10642                 1600.0   0.778151  
10716                 1800.0   1.000000  
11462                 2100.0   0.477121  
11745                 2625.0   0.845098  
12261                 1800.0   0.778151  
12481                 1900.0   1.079181  
12526                 2400.0   1.079181  
12869                 1335.0   0.903090  
12886                 1050.0   0.845098  
12904                 1260.0   1.230449  
13062                 2300.0   0.778151  
13222                 1750.0   1.462398  
13316                 1125.0   0.602060  
13699                 1440.0   0.903090  
13855                 1680.0   1.255273  
14497                 1540.0   1.000000  
14529                 2200.0   0.000000  
14747                 2240.0       -inf  
15023                 1920.0   1.113943  
15069                 2280.0   0.301030  
15077                 1386.0   0.778151  

[62 rows x 20 columns]
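One detail in the output above is worth flagging: row 14747 has a study time of 0.0, so its log_study value is -inf, because log10(0) is undefined. Infinite values can distort axis limits and summary statistics later on. The snippet below is a minimal sketch of two common ways to handle this, not the method this notebook uses. The column name `study_time` is a hypothetical stand-in, since the real column name is not visible in this output, and the sample values are copied from a few of the rows shown above.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the study-time column (assumed name, not the notebook's).
# Values are taken from rows 4903, 14529, 14747 and 15023 in the output above.
demo = pd.DataFrame({'study_time': [7.0, 1.0, 0.0, 13.0]})

# Option 1: mask zeros as NaN first, so log10 never receives 0
positive_only = demo['study_time'].where(demo['study_time'] > 0)
demo['log_study_nan'] = np.log10(positive_only)

# Option 2: shift by one so that a study time of 0 maps to 0
# (note that this changes every value, not just the zero)
demo['log_study_plus1'] = np.log10(demo['study_time'] + 1)

print(demo)
```

Option 1 keeps the non-zero values identical to the existing log_study column and simply leaves zero-study rows out of any log-scale plot. Option 2 keeps every row but is no longer a plain log10 transform, so axis labels would need to reflect the shift.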
In [41]:
# Flag rows where 'Learning Time - Science' exceeds 600
high_outliers_sci = pisa['Learning Time - Science'] > 600

# Count the flagged rows, then display them
print(high_outliers_sci.sum())
print(pisa.loc[high_outliers_sci, :])
180
                    Country  Student ID  Gender  \
180                 Austria        2024    Male   
189    United Arab Emirates        6450    Male   
193       Connecticut (USA)         905  Female   
224                   Chile         462    Male   
265    United Arab Emirates        4439    Male   
269          China-Shanghai        5053    Male   
372                   Chile         234    Male   
486           Florida (USA)        1561    Male   
504    United Arab Emirates        2784  Female   
521    United Arab Emirates         523    Male   
528    United Arab Emirates         305  Female   
574    United Arab Emirates        1343    Male   
590                  Brazil       18569  Female   
606    United Arab Emirates        1200    Male   
665    United Arab Emirates        4740  Female   
718    United Arab Emirates        6230    Male   
1015                 Canada        3257  Female   
1225               Portugal        2432  Female   
1255               Thailand        6355  Female   
1259                 Canada       14395  Female   
1429                Denmark        7234  Female   
1447   United Arab Emirates       11338  Female   
1527   United Arab Emirates       10443  Female   
1654   United Arab Emirates        9776    Male   
1753                 Brazil       17671  Female   
1771         United Kingdom        9960    Male   
1789   United Arab Emirates        4962  Female   
1813    Massachusetts (USA)        1435  Female   
1828   United Arab Emirates        1560    Male   
1928                 Latvia        2110  Female   
...                     ...         ...     ...   
12666  United Arab Emirates        2654  Female   
12744                Mexico       29566  Female   
12767                Mexico       29964    Male   
12837                Mexico       33633  Female   
12945                Mexico       19824    Male   
13102                Mexico       32610    Male   
13105                Mexico       32584  Female   
13109                Mexico       31542    Male   
13222                Mexico       32448    Male   
13239                Mexico       26943  Female   
13690                Mexico       32765    Male   
13801                Mexico       31559    Male   
13811                Mexico       32571    Male   
13824                Mexico       31481    Male   
13853                Mexico       33439  Female   
13856                Mexico       33490  Female   
14428                Mexico       28350    Male   
14433                Mexico       26889  Female   
14497                Mexico       27922    Male   
14499                Mexico       19776  Female   
14529                Canada       18264    Male   
14625                Mexico       25258  Female   
14626                Mexico       25521  Female   
14839                Mexico       33616    Male   
14846                Mexico       31610    Male   
14855                Mexico       26412    Male   
14870                Mexico       33592  Female   
15069                 Spain       21983    Male   
15128                Mexico       29784  Female   
15166                Mexico       32811    Male   

       Out-of-School Study Time - Homework  \
180                                    0.0   
189                                   16.0   
193                                   30.0   
224                                    2.0   
265                                   14.0   
269                                   18.0   
372                                    1.0   
486                                    2.0   
504                                   10.0   
521                                    3.0   
528                                    9.0   
574                                    5.0   
590                                    4.0   
606                                    2.0   
665                                    3.0   
718                                   30.0   
1015                                   2.0   
1225                                  10.0   
1255                                   7.0   
1259                                   1.0   
1429                                   2.0   
1447                                  14.0   
1527                                  28.0   
1654                                  14.0   
1753                                   4.0   
1771                                   4.0   
1789                                  10.0   
1813                                  27.0   
1828                                   4.0   
1928                                   2.0   
...                                    ...   
12666                                 11.0   
12744                                  5.0   
12767                                  6.0   
12837                                 10.0   
12945                                  4.0   
13102                                  1.0   
13105                                 18.0   
13109                                  4.0   
13222                                  7.0   
13239                                  7.0   
13690                                  1.0   
13801                                 20.0   
13811                                  8.0   
13824                                  6.0   
13853                                  7.0   
13856                                  5.0   
14428                                  3.0   
14433                                  4.0   
14497                                  4.0   
14499                                  2.0   
14529                                  1.0   
14625                                  5.0   
14626                                  4.0   
14839                                  7.0   
14846                                  1.0   
14855                                  8.0   
14870                                 21.0   
15069                                  2.0   
15128                                  3.0   
15166                                 12.0   

       Out-of-School Study Time - Guided Homework  \
180                                           0.0   
189                                           2.0   
193                                          10.0   
224                                           1.0   
265                                           3.0   
269                                          15.0   
372                                           0.0   
486                                           2.0   
504                                           3.0   
521                                           0.0   
528                                           4.0   
574                                           0.0   
590                                           3.0   
606                                           2.0   
665                                           0.0   
718                                           2.0   
1015                                          0.0   
1225                                          1.0   
1255                                          4.0   
1259                                          1.0   
1429                                          0.0   
1447                                          5.0   
1527                                          4.0   
1654                                          1.0   
1753                                          0.0   
1771                                          0.0   
1789                                          1.0   
1813                                          0.0   
1828                                          1.0   
1928                                          2.0   
...                                           ...   
12666                                         4.0   
12744                                         3.0   
12767                                         1.0   
12837                                         0.0   
12945                                         4.0   
13102                                         0.0   
13105                                         3.0   
13109                                         1.0   
13222                                         7.0   
13239                                         0.0   
13690                                         0.0   
13801                                         9.0   
13811                                         3.0   
13824                                         2.0   
13853                                         2.0   
13856                                         2.0   
14428                                         3.0   
14433                                         3.0   
14497                                         0.0   
14499                                         1.0   
14529                                         0.0   
14625                                         0.0   
14626                                         2.0   
14839                                         3.0   
14846                                         1.0   
14855                                         3.0   
14870                                         5.0   
15069                                         0.0   
15128                                         0.0   
15166                                         8.0   

       Out-of-School Study Time - Personal Tutor  \
180                                          0.0   
189                                          6.0   
193                                          0.0   
224                                          6.0   
265                                          3.0   
269                                          0.0   
372                                          0.0   
486                                          0.0   
504                                          1.0   
521                                          0.0   
528                                          0.0   
574                                          0.0   
590                                          8.0   
606                                          0.0   
665                                          0.0   
718                                          0.0   
1015                                         0.0   
1225                                         0.0   
1255                                         4.0   
1259                                         0.0   
1429                                         0.0   
1447                                         2.0   
1527                                         4.0   
1654                                         0.0   
1753                                         0.0   
1771                                         0.0   
1789                                         6.0   
1813                                         0.0   
1828                                         0.0   
1928                                         1.0   
...                                          ...   
12666                                        0.0   
12744                                        0.0   
12767                                        0.0   
12837                                        2.0   
12945                                        0.0   
13102                                        0.0   
13105                                        0.0   
13109                                        0.0   
13222                                        4.0   
13239                                        0.0   
13690                                        0.0   
13801                                        7.0   
13811                                        0.0   
13824                                        0.0   
13853                                        0.0   
13856                                        2.0   
14428                                        0.0   
14433                                        2.0   
14497                                        6.0   
14499                                        0.0   
14529                                        0.0   
14625                                        0.0   
14626                                        2.0   
14839                                        0.0   
14846                                        0.0   
14855                                        0.0   
14870                                        2.0   
15069                                        0.0   
15128                                        0.0   
15166                                        0.0   

       Out-of-School Study Time - Commercial Company  \
180                                              0.0   
189                                              0.0   
193                                              0.0   
224                                              6.0   
265                                              0.0   
269                                              6.0   
372                                              0.0   
486                                              0.0   
504                                              7.0   
521                                              0.0   
528                                              0.0   
574                                              2.0   
590                                              8.0   
606                                              0.0   
665                                              0.0   
718                                              0.0   
1015                                             0.0   
1225                                             0.0   
1255                                             7.0   
1259                                             0.0   
1429                                             0.0   
1447                                             0.0   
1527                                            16.0   
1654                                            11.0   
1753                                             0.0   
1771                                             0.0   
1789                                             1.0   
1813                                             0.0   
1828                                             0.0   
1928                                             2.0   
...                                              ...   
12666                                            0.0   
12744                                            0.0   
12767                                            0.0   
12837                                            0.0   
12945                                            0.0   
13102                                            0.0   
13105                                            0.0   
13109                                            0.0   
13222                                            4.0   
13239                                            0.0   
13690                                            0.0   
13801                                            0.0   
13811                                            0.0   
13824                                            0.0   
13853                                            0.0   
13856                                            0.0   
14428                                            0.0   
14433                                            0.0   
14497                                            0.0   
14499                                            1.0   
14529                                            0.0   
14625                                            0.0   
14626                                            1.0   
14839                                            0.0   
14846                                            0.0   
14855                                            0.0   
14870                                            0.0   
15069                                            0.0   
15128                                            0.0   
15166                                            0.0   

       Out-of-School Study Time - With Parent  Learning Time - Mathematics  \
180                                       0.0                        450.0   
189                                       4.0                        400.0   
193                                       1.0                        270.0   
224                                       0.0                        480.0   
265                                       0.0                        200.0   
269                                       0.0                        315.0   
372                                       1.0                        360.0   
486                                       2.0                        720.0   
504                                       4.0                        280.0   
521                                       0.0                        270.0   
528                                       4.0                        400.0   
574                                       0.0                        225.0   
590                                       1.0                        250.0   
606                                       2.0                        280.0   
665                                       0.0                        360.0   
718                                       2.0                        240.0   
1015                                      0.0                        455.0   
1225                                      0.0                        270.0   
1255                                      3.0                        350.0   
1259                                      1.0                        750.0   
1429                                      0.0                        180.0   
1447                                     10.0                        270.0   
1527                                      2.0                        315.0   
1654                                      2.0                        280.0   
1753                                      0.0                        315.0   
1771                                      1.0                        225.0   
1789                                      2.0                        200.0   
1813                                      1.0                        830.0   
1828                                      0.0                        225.0   
1928                                      1.0                        640.0   
...                                       ...                          ...   
12666                                     2.0                        200.0   
12744                                     7.0                        250.0   
12767                                     0.0                        270.0   
12837                                     0.0                        800.0   
12945                                     1.0                        500.0   
13102                                     0.0                        240.0   
13105                                     2.0                        440.0   
13109                                     1.0                       1440.0   
13222                                     7.0                        200.0   
13239                                     0.0                        400.0   
13690                                     1.0                        520.0   
13801                                     2.0                        540.0   
13811                                     1.0                        480.0   
13824                                     1.0                        250.0   
13853                                     3.0                        420.0   
13856                                     0.0                        480.0   
14428                                     0.0                        360.0   
14433                                     0.0                        500.0   
14497                                     0.0                        220.0   
14499                                     0.0                        250.0   
14529                                     0.0                        650.0   
14625                                     0.0                          0.0   
14626                                     5.0                        300.0   
14839                                     3.0                        240.0   
14846                                     1.0                        300.0   
14855                                     0.0                        400.0   
14870                                     8.0                        240.0   
15069                                     0.0                        720.0   
15128                                     1.0                        360.0   
15166                                     5.0                        225.0   

       Learning Time - Test Language  Learning Time - Science  \
180                            100.0                    650.0   
189                            400.0                    900.0   
193                            270.0                    900.0   
224                            480.0                   1080.0   
265                            160.0                   1800.0   
269                            320.0                    825.0   
372                            540.0                    720.0   
486                            720.0                    720.0   
504                            200.0                    760.0   
521                            270.0                    675.0   
528                            480.0                    960.0   
574                            225.0                    720.0   
590                            250.0                    700.0   
606                            280.0                    720.0   
665                            360.0                    720.0   
718                            240.0                    720.0   
1015                           390.0                    650.0   
1225                           180.0                    630.0   
1255                           100.0                    750.0   
1259                           750.0                    750.0   
1429                           720.0                    720.0   
1447                           270.0                    675.0   
1527                           180.0                   1080.0   
1654                           280.0                    720.0   
1753                           180.0                    675.0   
1771                           225.0                    675.0   
1789                           200.0                    960.0   
1813                             0.0                    830.0   
1828                           270.0                    765.0   
1928                           240.0                    640.0   
...                              ...                      ...   
12666                          280.0                    720.0   
12744                          200.0                    650.0   
12767                          270.0                   1080.0   
12837                          600.0                    800.0   
12945                          400.0                    800.0   
13102                          240.0                   1800.0   
13105                          165.0                    660.0   
13109                          600.0                    720.0   
13222                          850.0                    700.0   
13239                          150.0                    900.0   
13690                          580.0                    725.0   
13801                          225.0                    900.0   
13811                          240.0                    720.0   
13824                          250.0                    700.0   
13853                          420.0                    960.0   
13856                          540.0                    720.0   
14428                          240.0                    840.0   
14433                          400.0                    700.0   
14497                          660.0                    660.0   
14499                          150.0                    720.0   
14529                          800.0                    750.0   
14625                          350.0                    700.0   
14626                          300.0                    720.0   
14839                          240.0                   1560.0   
14846                          300.0                    720.0   
14855                          150.0                    700.0   
14870                          240.0                   1920.0   
15069                          720.0                    840.0   
15128                          600.0                    720.0   
15166                          135.0                    720.0   

       Average Math Score  Average Reading Score  Average Science Score  \
180             446.80636              357.64926              434.85640   
189             522.75278              499.03216              529.87692   
193             596.05078              619.64524              599.62700   
224             455.68624              455.64694              472.90192   
265             560.21968              517.39670              559.62326   
269             681.42236              577.22182              665.55390   
372             546.97772              537.44536              585.26668   
486             377.01356              366.79140              421.24208   
504             440.88644              497.71868              504.60650   
521             579.45942              598.79414              596.27004   
528             505.38246              544.90066              529.22418   
574             492.14052              417.47436              508.89594   
590             521.89594              550.46086              472.34240   
606             549.31454              523.49150              536.31108   
665             452.72628              501.21362              506.28496   
718             722.23866              727.34592              759.73518   
1015            579.69308              581.04178              511.41364   
1225            605.86540              613.29080              604.56918   
1255            569.41112              596.21310              564.28572   
1259            486.29850              465.54912              497.33308   
1429            478.27542              548.23678              490.43268   
1447            485.05218              505.97950              531.36888   
1527            622.76834              650.54392              593.65908   
1654            580.00466              506.49028              568.10892   
1753            571.82582              612.09932              543.11822   
1771            682.82444              640.01412              654.27078   
1789            482.63750              556.97420              513.74488   
1813            468.38292              518.60900              520.83178   
1828            554.14394              513.54738              550.48490   
1928            605.24222              635.29316              589.27638   
...                   ...                    ...                    ...   
12666           473.05654              530.60310              521.48452   
12744           514.96342              566.18818              492.01790   
12767           547.28928              589.25098              581.44348   
12837           520.41596              566.90310              514.49086   
12945           462.15144              447.22650              470.19768   
13102           515.82024              540.65312              490.43268   
13105           522.83068              543.78866              463.39052   
13109           427.95608              402.39778              431.12644   
13222           364.55056              343.21422              310.83548   
13239           363.84952              454.50820              371.72698   
13690           321.24162              344.17656              352.05146   
13801           570.50162              595.50614              582.93548   
13811           537.86414              508.57530              474.95338   
13824           436.60226              419.31882              394.01342   
13853           415.02570              458.63862              426.83702   
13856           434.65490              436.55678              432.24542   
14428           419.30984              452.75994              445.02052   
14433           481.39120              540.77026              431.49942   
14497           398.04486              450.67488              445.20700   
14499           467.68186              576.35536              472.15592   
14529           355.51488              348.10612              328.55278   
14625           478.27546              535.05124              437.84036   
14626           493.46472              497.48038              531.36888   
14839           379.89560              386.35886              362.02910   
14846           405.21112              399.18998              360.44386   
14855           408.17108              446.98594              426.74376   
14870           410.19632              442.75242              412.10370   
15069           376.23462              383.87286              485.95676   
15128           373.43044              361.73278              371.44722   
15166           369.06838              396.94456              320.90636   

       Average Total Score        Education - Father  \
180             413.104007  Bachelor’s or equivalent   
189             517.220620  Bachelor’s or equivalent   
193             605.107673  Bachelor’s or equivalent   
224             461.411700  Bachelor’s or equivalent   
265             545.746547  Bachelor’s or equivalent   
269             641.399360  Bachelor’s or equivalent   
372             556.563253  Bachelor’s or equivalent   
486             388.349013  Bachelor’s or equivalent   
504             481.070540  Bachelor’s or equivalent   
521             591.507867  Bachelor’s or equivalent   
528             526.502433  Bachelor’s or equivalent   
574             472.836940  Bachelor’s or equivalent   
590             514.899733  Bachelor’s or equivalent   
606             536.372373  Bachelor’s or equivalent   
665             486.741620  Bachelor’s or equivalent   
718             736.439920  Bachelor’s or equivalent   
1015            557.382833  Bachelor’s or equivalent   
1225            607.908460  Bachelor’s or equivalent   
1255            576.636647  Bachelor’s or equivalent   
1259            483.060233  Bachelor’s or equivalent   
1429            505.648293      Short-cycle tertiary   
1447            507.466853  Bachelor’s or equivalent   
1527            622.323780  Bachelor’s or equivalent   
1654            551.534620  Bachelor’s or equivalent   
1753            575.681120  Bachelor’s or equivalent   
1771            659.036447  Bachelor’s or equivalent   
1789            517.785527  Bachelor’s or equivalent   
1813            502.607900  Bachelor’s or equivalent   
1828            539.392073  Bachelor’s or equivalent   
1928            609.937253  Bachelor’s or equivalent   
...                    ...                       ...   
12666           508.381387  Bachelor’s or equivalent   
12744           524.389833           Upper secondary   
12767           572.661247           Upper secondary   
12837           533.936640  Bachelor’s or equivalent   
12945           459.858540      Short-cycle tertiary   
13102           515.635347      Short-cycle tertiary   
13105           510.003287      Short-cycle tertiary   
13109           420.493433      Short-cycle tertiary   
13222           339.533420      Short-cycle tertiary   
13239           396.694900      Short-cycle tertiary   
13690           339.156547           Lower secondary   
13801           582.981080                   Primary   
13811           507.130940                   Primary   
13824           416.644833                   Primary   
13853           433.500447                   Primary   
13856           434.485700                   Primary   
14428           439.030100           Upper secondary   
14433           484.553627           Upper secondary   
14497           431.308913      Short-cycle tertiary   
14499           505.397713      Short-cycle tertiary   
14529           344.057927            Post-secondary   
14625           483.722353  Bachelor’s or equivalent   
14626           507.437993  Bachelor’s or equivalent   
14839           376.094520           Lower secondary   
14846           388.281653           Upper secondary   
14855           427.300260           Upper secondary   
14870           421.684147           Upper secondary   
15069           415.354747      Short-cycle tertiary   
15128           368.870147           Upper secondary   
15166           362.306433      Short-cycle tertiary   

             Education - Mother  Out-of-School Study Time - Total  \
180    Bachelor’s or equivalent                               0.0   
189    Bachelor’s or equivalent                              28.0   
193    Bachelor’s or equivalent                              41.0   
224    Bachelor’s or equivalent                              15.0   
265    Bachelor’s or equivalent                              20.0   
269    Bachelor’s or equivalent                              39.0   
372    Bachelor’s or equivalent                               2.0   
486    Bachelor’s or equivalent                               6.0   
504    Bachelor’s or equivalent                              25.0   
521    Bachelor’s or equivalent                               3.0   
528    Bachelor’s or equivalent                              17.0   
574    Bachelor’s or equivalent                               7.0   
590    Bachelor’s or equivalent                              24.0   
606    Bachelor’s or equivalent                               6.0   
665    Bachelor’s or equivalent                               3.0   
718    Bachelor’s or equivalent                              34.0   
1015   Bachelor’s or equivalent                               2.0   
1225   Bachelor’s or equivalent                              11.0   
1255   Bachelor’s or equivalent                              25.0   
1259   Bachelor’s or equivalent                               3.0   
1429   Bachelor’s or equivalent                               2.0   
1447   Bachelor’s or equivalent                              31.0   
1527   Bachelor’s or equivalent                              54.0   
1654   Bachelor’s or equivalent                              28.0   
1753   Bachelor’s or equivalent                               4.0   
1771   Bachelor’s or equivalent                               5.0   
1789   Bachelor’s or equivalent                              20.0   
1813   Bachelor’s or equivalent                              28.0   
1828   Bachelor’s or equivalent                               5.0   
1928   Bachelor’s or equivalent                               8.0   
...                         ...                               ...   
12666           Upper secondary                              17.0   
12744           Upper secondary                              15.0   
12767           Upper secondary                               7.0   
12837           Upper secondary                              12.0   
12945           Lower secondary                               9.0   
13102           Lower secondary                               1.0   
13105           Lower secondary                              23.0   
13109           Lower secondary                               6.0   
13222           Lower secondary                              29.0   
13239           Lower secondary                               7.0   
13690           Lower secondary                               2.0   
13801           Lower secondary                              38.0   
13811           Lower secondary                              12.0   
13824           Lower secondary                               9.0   
13853           Lower secondary                              12.0   
13856           Lower secondary                               9.0   
14428           Lower secondary                               6.0   
14433           Lower secondary                               9.0   
14497                   Primary                              10.0   
14499                   Primary                               4.0   
14529                   Primary                               1.0   
14625                   Primary                               5.0   
14626                   Primary                              14.0   
14839                   Primary                              13.0   
14846                   Primary                               3.0   
14855                   Primary                              11.0   
14870                   Primary                              36.0   
15069           Early childhood                               2.0   
15128           Early childhood                               4.0   
15166           Early childhood                              25.0   

       Learning Time - Total  log_study  
180                   1200.0       -inf  
189                   1700.0   1.447158  
193                   1440.0   1.612784  
224                   2040.0   1.176091  
265                   2160.0   1.301030  
269                   1460.0   1.591065  
372                   1620.0   0.301030  
486                   2160.0   0.778151  
504                   1240.0   1.397940  
521                   1215.0   0.477121  
528                   1840.0   1.230449  
574                   1170.0   0.845098  
590                   1200.0   1.380211  
606                   1280.0   0.778151  
665                   1440.0   0.477121  
718                   1200.0   1.531479  
1015                  1495.0   0.301030  
1225                  1080.0   1.041393  
1255                  1200.0   1.397940  
1259                  2250.0   0.477121  
1429                  1620.0   0.301030  
1447                  1215.0   1.491362  
1527                  1575.0   1.732394  
1654                  1280.0   1.447158  
1753                  1170.0   0.602060  
1771                  1125.0   0.698970  
1789                  1360.0   1.301030  
1813                  1660.0   1.447158  
1828                  1260.0   0.698970  
1928                  1520.0   0.903090  
...                      ...        ...  
12666                 1200.0   1.230449  
12744                 1100.0   1.176091  
12767                 1620.0   0.845098  
12837                 2200.0   1.079181  
12945                 1700.0   0.954243  
13102                 2280.0   0.000000  
13105                 1265.0   1.361728  
13109                 2760.0   0.778151  
13222                 1750.0   1.462398  
13239                 1450.0   0.845098  
13690                 1825.0   0.301030  
13801                 1665.0   1.579784  
13811                 1440.0   1.079181  
13824                 1200.0   0.954243  
13853                 1800.0   1.079181  
13856                 1740.0   0.954243  
14428                 1440.0   0.778151  
14433                 1600.0   0.954243  
14497                 1540.0   1.000000  
14499                 1120.0   0.602060  
14529                 2200.0   0.000000  
14625                 1050.0   0.698970  
14626                 1320.0   1.146128  
14839                 2040.0   1.113943  
14846                 1320.0   0.477121  
14855                 1250.0   1.041393  
14870                 2400.0   1.556303  
15069                 2280.0   0.301030  
15128                 1680.0   0.602060  
15166                 1080.0   1.397940  

[180 rows x 20 columns]

Since the number of outliers is so low and they add little relevant information to the analysis, we will continue without them.

In [42]:
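# Remove outliers (~ negates a boolean mask element-wise; unary minus fails on boolean Series in current pandas/NumPy)
pisa = pisa.loc[~high_outliers_math & ~high_outliers_lang & ~high_outliers_sci, :]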
In [43]:
# Re-plotting the distributions of Learning Times
fig, ax = plt.subplots(nrows=3, figsize = [18,20])

variables = ['Learning Time - Mathematics', 'Learning Time - Test Language', 'Learning Time - Science']
for i in range(len(variables)):
    var = variables[i]
    ax[i].hist(data = pisa, x = var, color=color3)
    ax[i].set_xlabel('{} (mins/week)'.format(var))
    ax[i].set_ylabel('Frequency')
    ax[i].set_title('{}'.format(var))


plt.show()

Last but not least, we still have the parental education levels to analyze.

In [44]:
# Distribution of the ordinal variables: Father's and Mother's Education
fig, ax = plt.subplots(nrows=2, figsize = [18,18])

sb.countplot(data = pisa, x = 'Education - Father', color = color_male, ax = ax[0])
sb.countplot(data = pisa, x = 'Education - Mother', color = color_female, ax = ax[1])

plt.show()

This shows that the students in this dataset typically have parents with higher educational levels. Short-cycle Tertiary education is clearly the most common level for both mothers and fathers, while parents with only Early Childhood education have the fewest children in this dataset.

Discuss the distribution(s) of your variable(s) of interest. Were there any unusual points? Did you need to perform any transformations?

For 'Average Total Score', the distribution was strikingly normal. This was expected to an extent, since student grades typically fall along a bell curve. No unusual points stood out for this variable, nor for the three subject scores that make up the total score, so no transformations were necessary to make sense of the data.

Of the features you investigated, were there any unusual distributions? Did you perform any operations on the data to tidy, adjust, or change the form of the data? If so, why did you do this?

The secondary features investigated were Study Times, Learning Times, and Parental Education.

For Study Times, the total had a strong right skew, as did each of the individual Study Times that make up the total. To better understand this feature, we plotted the total on a logarithmic scale to check whether it was really unimodal and to spot any other irregularities. On that scale, it turned out to be unimodal and roughly normal.
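The `log_study` column printed earlier contains `-inf` for a student with zero study time, which is what a base-10 logarithm of 0 produces. The following is a minimal sketch of two common ways to handle this, using a made-up toy frame rather than the real `pisa` data; the names `toy` and `log1p_study` are hypothetical.

```python
import numpy as np
import pandas as pd

col = 'Out-of-School Study Time - Total'

# Toy stand-in for the notebook's data (values are made up)
toy = pd.DataFrame({col: [0.0, 1.0, 10.0, 28.0]})

# Option 1: mask zeros first, so log10 returns NaN instead of -inf
toy['log_study'] = np.log10(toy[col].where(toy[col] > 0))

# Option 2: shift by one unit before taking the log, so zero maps to 0.0
toy['log1p_study'] = np.log10(toy[col] + 1)

print(toy)
```

Option 1 drops zero-time students from log-scale plots, since most plotting functions skip NaN. Option 2 keeps them but changes the scale slightly, so axis labels should say so.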

As for Learning Time, this data clearly had outliers, so for each of the Learning Times, values over 600 minutes were excluded. This was done to focus on more typical student results, so that later plots are not distorted by these exceptionally dedicated students.
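As a rough illustration of this kind of filter, the sketch below builds one boolean mask per Learning Time column and keeps only rows where none of them exceeds the cutoff. The toy values and the mask names are made up, and the exact definition of the notebook's own outlier masks is assumed to be a simple `> 600` comparison.

```python
import pandas as pd

# Toy stand-in for the three Learning Time columns (values are made up)
toy = pd.DataFrame({
    'Learning Time - Mathematics': [200.0, 650.0, 300.0, 180.0],
    'Learning Time - Test Language': [220.0, 240.0, 700.0, 200.0],
    'Learning Time - Science': [150.0, 160.0, 170.0, 180.0],
})

threshold = 600  # minutes per week, the cutoff named in the text

high_math = toy['Learning Time - Mathematics'] > threshold
high_lang = toy['Learning Time - Test Language'] > threshold
high_sci = toy['Learning Time - Science'] > threshold

# ~ is the element-wise NOT for a boolean Series
kept = toy.loc[~high_math & ~high_lang & ~high_sci]

# Equivalent form: negate the union of the masks
kept_alt = toy.loc[~(high_math | high_lang | high_sci)]

print(kept)
```

Note that comparisons against NaN evaluate to False, so rows with a missing Learning Time are kept by this filter. The filtered frame also retains its original index labels, which leaves gaps in the index.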

The Parental Education variables are weighted somewhat toward parents with higher educational levels, but considering the plots we will run, this should not have a great impact, so we will leave them as they are.

Bivariate Exploration

Out-of-School Study Time and Learning Time

To start off, let's look at the correlations between each of the Scores, the Total Out-of-School Study Time, and the Total Learning Time to see if the amount of time dedicated to a subject has an influence on the score, and how strongly the Scores are correlated with one another. This will help us answer the question of whether or not there is a relationship between the amount of time a student dedicates to learning and their score.

In [45]:
numeric_vars = ['Average Math Score', 'Average Reading Score', 'Average Science Score', 'Average Total Score', 'Out-of-School Study Time - Total', 'Learning Time - Total']
In [46]:
# Correlation plot
plt.figure(figsize = [8, 5])
sb.heatmap(pisa[numeric_vars].corr(), annot = True, fmt = '.3f',
           cmap = 'BrBG', center = 0)
plt.show()

Considering the correlations between the Scores, the Total Out-of-School Study Time and Total Learning Time, we can see that the Total Learning Time is slightly better correlated with the scores than the Total Out-of-School Study Time, with the Average Reading Score being the exception.
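The heatmap is drawn from a plain Pearson correlation matrix, so the same comparison can be read off numerically. Here is a small sketch with made-up numbers (not the PISA data) that ranks the time variables by the absolute size of their correlation with the total score; `toy` and `ranked` are hypothetical names.

```python
import pandas as pd

# Toy data (made up): learning time tracks the score closely, study time does not
toy = pd.DataFrame({
    'Average Total Score': [400, 430, 450, 500, 510, 560],
    'Learning Time - Total': [1000, 1200, 1400, 1600, 1800, 2000],
    'Out-of-School Study Time - Total': [5, 20, 3, 18, 9, 12],
})

# Pearson correlation matrix, the same call the heatmap cell relies on
corr = toy.corr()

# Correlation of each time variable with the score, strongest first
ranked = (corr['Average Total Score']
          .drop('Average Total Score')
          .abs()
          .sort_values(ascending=False))

print(ranked)
```

Pearson correlation only captures linear association, so a weak coefficient does not rule out a nonlinear relationship.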

To better understand the relationship between the Scores and the Learning Time, let's look at the breakdown of the Learning Time for each subject.

In [47]:
score_learn_vars = ['Average Math Score', 'Average Reading Score', 'Average Science Score', 
                'Average Total Score', 'Learning Time - Mathematics',
                'Learning Time - Test Language', 'Learning Time - Science', 
                'Learning Time - Total']
In [48]:
# correlation plot
plt.figure(figsize = [8, 5])
sb.heatmap(pisa[score_learn_vars].corr(), annot = True, fmt = '.3f',
           cmap = 'BrBG', center = 0)
plt.show()

Interestingly, we can see that the Learning Times for Mathematics and the Test Language show almost no correlation with any of the Scores, in contrast to the Learning Time for Science.

We can look at these variables now through another perspective: seeing the scatter plot relationships between them.

In [49]:
# Draw row positions and index positionally: removing outliers left gaps in the index labels,
# so label-based .loc would look up labels that no longer exist
samples = np.random.choice(pisa.shape[0], 500, replace = False)
pisa_samp = pisa.iloc[samples,:]

g = sb.PairGrid(data = pisa_samp, vars = score_learn_vars)
g = g.map_diag(plt.hist, bins = 20, color='#ffcd60');
g.map_offdiag(plt.scatter, color = color1);

As expected, we can clearly see a strong positive correlation between each of the Scores. As for the Learning Times, a positive relationship is visible between each pair, albeit not a very strong one, and with some outliers.

When it comes to the relationship between the Scores and Learning Times, this plot suggests that the amount of time a student spends learning a subject has no visible relationship with the Score they receive.
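One detail matters when drawing a random subset for plots like this: once rows have been filtered out, the index labels no longer run from 0 to n-1, so random positions and index labels are not interchangeable. The sketch below, on a made-up toy frame, shows two ways to sample that do not depend on the labels; all names in it are hypothetical.

```python
import numpy as np
import pandas as pd

# Toy frame, then a filter that leaves gaps in the index (labels 1, 2, 4, 5, 7, 8)
toy = pd.DataFrame({'score': np.arange(10.0)})
filtered = toy.loc[toy['score'] % 3 != 0]

# Random row positions in 0..len-1
rng = np.random.RandomState(2018)
positions = rng.choice(filtered.shape[0], 3, replace=False)

# Positional indexing works regardless of the labels
by_position = filtered.iloc[positions]

# DataFrame.sample does the same job in a single call
by_sample = filtered.sample(n=3, random_state=2018)

print(by_position)
print(by_sample)
```

Label-based `.loc` with these positions would ask for labels such as 0 or 3 that were filtered out, which older pandas versions answer with NaN rows and a FutureWarning and newer versions with a KeyError.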

In [50]:
score_study_vars = ['Average Math Score', 'Average Reading Score', 'Average Science Score', 
                'Average Total Score', 'Out-of-School Study Time - Homework',
                      'Out-of-School Study Time - Guided Homework',
                      'Out-of-School Study Time - Personal Tutor',
                      'Out-of-School Study Time - Commercial Company',
                      'Out-of-School Study Time - With Parent', 
                      'Out-of-School Study Time - Total']
In [51]:
# correlation plot
plt.figure(figsize = [8, 5])
sb.heatmap(pisa[score_study_vars].corr(), annot = True, fmt = '.3f',
           cmap = 'BrBG', center = 0)
plt.show()

The results of this correlation plot are noteworthy: they indicate that study time spent on Guided Homework, with a Personal Tutor, with a Commercial Company, or with a Parent has no positive association with a student's score. This could be because the students who need this kind of help are the ones already struggling with grades, but since we have no information on the previous Scores of these students, we cannot explore this theory any further for now.

We can, however, look deeper into the role of Homework in a student's Score.

In [52]:
score_study_vars = ['Average Math Score', 'Average Reading Score', 'Average Science Score', 
                'Average Total Score', 'Out-of-School Study Time - Homework']
In [53]:
# Index positionally, since the index labels have gaps after outlier removal
samples = np.random.choice(pisa.shape[0], 500, replace = False)
pisa_samp = pisa.iloc[samples,:]

g = sb.PairGrid(data = pisa_samp, vars = score_study_vars)
g = g.map_diag(plt.hist, bins = 20, color='#ffcd60');
g.map_offdiag(plt.scatter, color = color1);

Although the relationship between Homework Study Time and the various Scores is weak, we can see that the more time a student spends on Homework, the higher their Score tends to be. This relationship only really holds up to a Score of about 450, though. So students at the bottom of the Score ranking who spend time on Homework may be able to move toward average Scores, while the higher Scores seem generally unaffected.
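A levelling-off pattern like this is hard to judge from a scatter plot alone. One way to check it, sketched here on made-up numbers rather than the PISA data, is to bin Homework time and compare the mean Score per bin; the bin edges and the names `toy`, `homework_bin`, and `binned` are arbitrary assumptions.

```python
import pandas as pd

hw = 'Out-of-School Study Time - Homework'

# Toy data (made up): scores rise with homework time, then flatten
toy = pd.DataFrame({
    hw: [0, 1, 2, 3, 5, 6, 8, 10, 12, 15],
    'Average Total Score': [380, 400, 420, 450, 470, 480, 485, 490, 488, 492],
})

# Cut homework time into intervals; include_lowest keeps the zero values
bins = [0, 2, 5, 10, 15]
toy['homework_bin'] = pd.cut(toy[hw], bins=bins, include_lowest=True)

# Mean score and group size per bin
binned = (toy.groupby('homework_bin', observed=True)['Average Total Score']
          .agg(['mean', 'count']))

print(binned)
```

If the per-bin means climb steeply at first and then flatten, that supports the reading above. The group sizes show how much weight each bin deserves.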

Lastly, let's look at the relationship between the Study Time and Learning Time variables to see if they correlate strongly with one another in any interesting way.

In [54]:
time_vars = ['Out-of-School Study Time - Homework',
                      'Out-of-School Study Time - Guided Homework',
                      'Out-of-School Study Time - Personal Tutor',
                      'Out-of-School Study Time - Commercial Company',
                      'Out-of-School Study Time - With Parent',
                      'Learning Time - Mathematics',
                      'Learning Time - Test Language',
                      'Learning Time - Science']
In [55]:
# correlation plot
plt.figure(figsize = [8, 5])
sb.heatmap(pisa[time_vars].corr(), annot = True, fmt = '.3f',
           cmap = 'BrBG', center = 0)
plt.show()

When it comes to the Study Times and Learning Times, no relationship is visible between the two categories, and the variables barely correlate within each category either. So we cannot say that certain students study more than others both in school and outside of school. Overall for this section, we cannot see much of an influence of time spent learning on Scores.

Now we can see our next set of factors that might influence the Score of a student:

Parental Education and Gender

To start off, let's look at the distribution of Scores for each level of education, along with the frequency of each level.

In [56]:
g = sb.FacetGrid(data = pisa, col = 'Education - Mother');
g.map(plt.hist, 'Average Total Score', color = color_female);

Here we can see that the children in this dataset frequently have mothers with a Short-cycle Tertiary education. In terms of Scores for each level, children whose mothers have only Early Childhood education perform much worse, with a distribution that does not even reach a Score of 600. Meanwhile, the distribution for the highest level, Bachelor's or equivalent, is slightly left skewed and extends past the 600 mark.

In [57]:
g = sb.FacetGrid(data = pisa, col = 'Education - Father');
g.map(plt.hist, 'Average Total Score', color = color_male);

The same can be said for the education levels of the fathers, except that here we have more fathers with a Bachelor's or equivalent education.

Next we can look at the distribution for each of these levels to see the range and medians better.

In [58]:
plt.figure(figsize=[18,8])
sb.violinplot(data = pisa, 
              x = 'Education - Father', 
              y = 'Average Total Score',
              color = color_male)
plt.title('Average Total Score Across Education Levels of Father');

Interestingly, the spread is quite large for the children of more highly educated fathers. In fact, it appears that the child who performed worst had a father with Short-cycle Tertiary education. Meanwhile, the children whose fathers have only Early Childhood education seem to have a much smaller range and cluster more tightly around the median.

In [59]:
plt.figure(figsize=[18,8])
sb.violinplot(data = pisa, 
              x = 'Education - Mother', 
              y = 'Average Total Score', 
              color = color_female)
plt.title('Average Total Score Across Education Levels of Mother');

The violin plot for the Mother's Education is more along the lines of what we expect, with the median rising from one level to the next and each level showing a reasonable spread.

But to see the extent to which the outliers play a role, we can look at the same data with box plots.

In [60]:
plt.figure(figsize=[18,8])
sb.boxplot(data = pisa, 
              x = 'Education - Father', 
              y = 'Average Total Score',
              color = color_male);
plt.title('Average Total Score Across Education Levels of Father');

Once again we can see the student who performs lowest overall is an outlier for the Short-cycle Tertiary level, and in general the same trend exists.

In [61]:
plt.figure(figsize=[18,8])
sb.boxplot(data = pisa, 
              x = 'Education - Mother', 
              y = 'Average Total Score',
              color = color_female)
plt.title('Average Total Score Across Education Levels of Mother');

Here we can see that for the lower levels of mother's education, students generally achieve lower grades, although there is a good number of high-score outliers. For the upper half of the educational levels, there is a tendency toward high grades with a few low-score outliers.

In [62]:
# Score averages of students vs education levels of Father
plt.figure(figsize=[18,8])
sb.pointplot(data = pisa, 
              x = 'Education - Father', 
              y = 'Average Total Score',
             color = color_male)

# Score averages of students vs education levels of Mother
sb.pointplot(data = pisa, 
              x = 'Education - Mother', 
              y = 'Average Total Score',
              color = color_female)

plt.title('Average Total Score Across Education Levels of Parents')

# Set legend with explicit handles so each entry matches its line color
# (avoids leg.legendHandles, which newer Matplotlib releases no longer provide)
from matplotlib.lines import Line2D
handles = [Line2D([0], [0], color=color_male, marker='o', label="Father's Education"),
           Line2D([0], [0], color=color_female, marker='o', label="Mother's Education")]
plt.legend(handles=handles);

In general, we can see that the student Scores grow with the education level of the parent, regardless of the gender of the parent, until a point where it seems to plateau.
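The point plot shows the mean Score per education level, and the same numbers can be computed directly. The sketch below uses made-up values and assumes the ordering of levels implied by the text, from Early childhood up to Bachelor’s or equivalent. It declares the column as an ordered categorical so that groups come out in educational order rather than alphabetically.

```python
import pandas as pd

# Assumed ordering of the education levels, lowest to highest
levels = ['Early childhood', 'Primary', 'Lower secondary', 'Upper secondary',
          'Post-secondary', 'Short-cycle tertiary', 'Bachelor’s or equivalent']

# Toy data (made up)
toy = pd.DataFrame({
    'Education - Father': ['Primary', 'Bachelor’s or equivalent', 'Primary',
                           'Upper secondary', 'Bachelor’s or equivalent',
                           'Upper secondary'],
    'Average Total Score': [420.0, 540.0, 440.0, 500.0, 560.0, 480.0],
})

# Ordered categorical dtype: sorting and grouping now follow `levels`
edu_type = pd.api.types.CategoricalDtype(categories=levels, ordered=True)
toy['Education - Father'] = toy['Education - Father'].astype(edu_type)

# Mean score per level, restricted to levels that actually occur
means = (toy.groupby('Education - Father', observed=True)['Average Total Score']
         .mean())

print(means)
```

With the ordered dtype in place, seaborn's categorical plots also draw the levels in this order without needing an explicit `order` argument.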

Now we can move towards looking at the gender of the child as well.

In [63]:
plt.figure(figsize=[10,8])

sb.boxplot(data = pisa, 
              x = 'Gender', 
              y = 'Average Total Score',
              palette = color_gends);

Looking at the role that Gender plays in the Score, the ranges seem to match. However, the outliers for males dip lower.

In [64]:
plt.figure(figsize=[18,7])

sb.countplot(data = pisa, x = 'Education - Father', hue = 'Gender', palette = color_gends);

Here we can see how many female and male children have fathers at each educational level. The counts are generally about the same, except for Bachelor's or equivalent, where there are many more males than females.

Now we can look at whether gender plays a role in the Score of a student.

In [65]:
# Create a subset to better see comparison plots
np.random.seed(2018)
sample = np.random.choice(pisa.shape[0], 200, replace=False)
# Index positionally, since the index labels have gaps after outlier removal
pisa_subset = pisa.iloc[sample]
In [68]:
g = sb.FacetGrid(data = pisa_subset, hue = 'Gender', palette = color_gends, height=5)
g.map(sb.regplot, 'Average Total Score', 'Average Reading Score', fit_reg = False)
plt.legend();

Here we can see that females have a tendency for higher Reading Scores, and males have a tendency for higher Math Scores.

In [69]:
g = sb.FacetGrid(data = pisa_subset, hue = 'Gender', palette = color_gends, height=5)
g.map(sb.regplot, 'Average Total Score', 'Average Science Score', fit_reg = False)
plt.legend();

The same separation cannot be made when comparing Math to Science for male and female. They seem to overlap completely.

In [70]:
g = sb.FacetGrid(data = pisa_subset, hue = 'Gender', palette = color_gends, height=5)
g.map(sb.regplot, 'Average Total Score', 'Average Math Score', fit_reg = False)
plt.legend();

Once again, females slightly outperform males when it comes to the Reading Score.

In [111]:
g = sb.FacetGrid(data = pisa_subset, hue = 'Gender', palette = color_gends, height=5)
g.map(sb.regplot, 'Average Total Score', 'Out-of-School Study Time - Homework', fit_reg = False)
plt.legend();

Homework, the one Out-of-School Study Time variable that showed a noteworthy correlation earlier, has a negligible relationship with both Score and Gender here.

In [109]:
g = sb.FacetGrid(data = pisa_subset, hue = 'Gender', palette = color_gends, height=5)
g.map(sb.regplot, 'Average Science Score', 'Learning Time - Science', fit_reg = False)
plt.legend();

As with the Out-of-School Study Time variable, we can look at the Science Score vs. the Science Learning Time here since it was the strongest relationship. Once again, the effect of Gender is not visible.

Talk about some of the relationships you observed in this part of the investigation. How did the feature(s) of interest vary with other features in the dataset?

In this section it became visible that the Scores were less influenced by Out-of-School Study Time and Learning Time than expected. For Learning Time in school, we saw that Science had a more positive correlation with each of the Scores than the Math and Reading Learning Times.

The scores were, however, strongly associated with the educational level of the parents. We saw that the higher the level of education of either the mother or the father, the higher the student's score tends to be, at least on average. We also saw that female students slightly outperformed male students on the Average Reading Score, but in general females and males performed about the same.
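The gender comparison can also be summarized numerically instead of being read off scatter plots. This is a minimal sketch on made-up values; the labels 'Female' and 'Male' are an assumption about how the Gender column is coded, and `toy`, `by_gender`, and `gap` are hypothetical names.

```python
import pandas as pd

# Toy data (made up); the Gender labels are assumed
toy = pd.DataFrame({
    'Gender': ['Female', 'Male', 'Female', 'Male'],
    'Average Math Score': [500.0, 510.0, 480.0, 490.0],
    'Average Reading Score': [530.0, 495.0, 510.0, 475.0],
})

score_cols = ['Average Math Score', 'Average Reading Score']

# Mean of each score per gender
by_gender = toy.groupby('Gender')[score_cols].mean()

# Positive values mean females score higher on average
gap = by_gender.loc['Female'] - by_gender.loc['Male']

print(by_gender)
print(gap)
```

A difference in means says nothing about the overlap of the two distributions, so it complements the box plots rather than replacing them.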

Did you observe any interesting relationships between the other features (not the main feature(s) of interest)?

Interestingly enough, Out-of-School Study Time and Learning Time were not as significant as I had expected. In particular, we can see that the only significant and positively correlated Out-of-School Study Time variable was Homework, and the rest were correlated in a weak negative way to the student's score.

Multivariate Exploration

To start off this section of exploration, let's continue the box plots and gender comparisons from before.

In [71]:
plt.figure(figsize=[18,10])
sb.boxplot(data = pisa, 
              x = 'Education - Father', 
              y = 'Average Total Score',
              hue = 'Gender',
              palette = color_gends);
In [71]:
plt.figure(figsize=[18,10])
sb.boxplot(data = pisa, 
              x = 'Education - Mother', 
              y = 'Average Total Score',
              hue = 'Gender',
              palette = color_gends);
plt.show();

Here we answer one of the original questions: whether there are differences in achievement based on gender or parental education levels. For both Father and Mother, we can see a negligible difference between males and females for all levels. The widest gap between the two genders exists for the Primary education level for both Father and Mother, but the proportion of students in this category is small, and the medians are nevertheless similar enough.

As for the educational levels of the parents, those definitely play a role in how successful a student tends to be. There is of course a dramatic spread in both directions, along with outliers, but the median Score seems closely related to the educational level of either Mother or Father.

Now we can observe the relationship between Learning Times and their respective subjects.

In [129]:
# Faceted scatter plots on levels of father's education
g = sb.FacetGrid(data = pisa, col = 'Education - Father', col_wrap = 4, height = 5)
g.map(sb.regplot, 'Average Science Score', 'Learning Time - Science', 
      color = color1,  x_jitter = 0.3,
      scatter_kws = {'alpha' : 1/20}, 
      line_kws={"color": color3})
g.set_xlabels('Average Science Score')
g.set_ylabels('Learning Time(mins/week)- Science')

plt.show()

Previously, the Learning Time for Science looked promising in its correlation with the Average Science Score, at least compared to the other subject pairs. The regression plots here, however, suggest a split between groups of students. The regression line appears to show a negative correlation between Learning Time for Science and the Average Science Score for students whose fathers achieved Primary, Lower secondary, or Upper secondary education. For the three highest levels of education in our dataframe (Post-secondary, Short-cycle tertiary, and Bachelor's or equivalent), we see a positive correlation. This might indicate that the higher the father's education, the more likely it is that Science-related Learning Time in school is associated with a higher grade.
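The sign of each facet's regression line can be cross-checked by computing the correlation within each education level. The sketch below does this on made-up data with only two levels; the names `toy` and `per_level` are hypothetical, and the values are chosen purely to show one negative and one positive group.

```python
import pandas as pd

time_col = 'Learning Time - Science'
score_col = 'Average Science Score'

# Toy data (made up): negative trend for one level, positive for the other
toy = pd.DataFrame({
    'Education - Father': ['Primary'] * 4 + ['Post-secondary'] * 4,
    time_col: [100, 200, 300, 400, 100, 200, 300, 400],
    score_col: [480, 460, 450, 430, 470, 490, 520, 540],
})

# Pearson correlation between learning time and score within each level
per_level = pd.Series({
    level: group[time_col].corr(group[score_col])
    for level, group in toy.groupby('Education - Father')
})

print(per_level)
```

The sign of a group's correlation matches the sign of its fitted slope, so this gives a quick numeric check of what the facets show. Small groups can produce unstable values, so group sizes are worth checking alongside it.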

In [73]:
# Faceted scatter plots on levels of mother's education
g = sb.FacetGrid(data = pisa, col = 'Education - Mother', col_wrap = 4, height = 5)
g.map(sb.regplot, 'Average Science Score', 'Learning Time - Science', color = color_female,  x_jitter = 0.3,
      scatter_kws = {'alpha' : 1/20}, 
      line_kws={"color": line})
g.set_xlabels('Average Science Score')
g.set_ylabels('Learning Time (mins/week) - Science')

plt.show()

The results for the Mother's education match, which adds support to the argument that more Science-related Learning Time in school goes along with a better Science Score when parental education is Post-secondary or higher.

Considering this, it would be interesting if we saw similar results for the Mathematics and Reading related scores.

In [74]:
# Faceted scatter plots on levels of father's education
g = sb.FacetGrid(data = pisa, col = 'Education - Father', col_wrap = 4, height = 5)
g.map(sb.regplot, 'Average Math Score', 'Learning Time - Mathematics', color = color1, x_jitter = 0.3,
      scatter_kws = {'alpha' : 1/20}, 
      line_kws={"color": color2})
g.set_xlabels('Average Math Score')
g.set_ylabels('Learning Time (mins/week) - Math')

plt.show()
In [75]:
# Faceted scatter plots on levels of mother's education
g = sb.FacetGrid(data = pisa, col = 'Education - Mother', col_wrap = 4, height = 5)
g.map(sb.regplot, 'Average Math Score', 'Learning Time - Mathematics', color = color_female, x_jitter = 0.3,
      scatter_kws = {'alpha' : 1/20}, 
      line_kws={"color": line})
g.set_xlabels('Average Math Score')
g.set_ylabels('Learning Time (mins/week) - Math')

plt.show()

For both Mother and Father, there is nothing notable in these plots. We cannot draw the same conclusion as we did for the Science Learning Time and Score relationship. Here, the amount of Learning Time for Mathematics does not seem to play a role in a student's Math Score.

In [76]:
# Faceted scatter plots on levels of father's education
g = sb.FacetGrid(data = pisa, col = 'Education - Father', col_wrap = 4, height = 5)
g.map(sb.regplot, 'Average Reading Score', 'Learning Time - Test Language', color = color1, x_jitter = 0.3,
      scatter_kws = {'alpha' : 1/20}, 
      line_kws={"color": color2})
g.set_xlabels('Average Reading Score')
g.set_ylabels('Learning Time (mins/week) - Test Language')

plt.show()
In [77]:
# Faceted scatter plots on levels of mother's education
g = sb.FacetGrid(data = pisa, col = 'Education - Mother', col_wrap = 4, height = 5)
g.map(sb.regplot, 'Average Reading Score', 'Learning Time - Test Language', color = color_female, x_jitter = 0.3,
      scatter_kws = {'alpha' : 1/20}, 
      line_kws={"color": line})
g.set_xlabels('Average Reading Score')
g.set_ylabels('Learning Time (mins/week) - Test Language')

plt.show()

The Reading Score and the Learning Time for the Test Language behave just as Mathematics did. There are no clear trends, and we cannot conclude that Learning Time plays a role in the Reading Score.

So for the Learning Times, we can conclude that Science-related Learning Time had the strongest relationship with its corresponding Score, and the other two relationships are negligible.

Now we can move on to Out-of-School Study Time. Previously, there were very few promising results from the analysis, so let's see if breaking it down by parental education level changes the picture.

In [78]:
# Faceted scatter plots on levels of father's education
g = sb.FacetGrid(data = pisa, col = 'Education - Father', col_wrap = 4, height = 5)
g.map(sb.regplot, 'Average Total Score', 'Out-of-School Study Time - Total', 
      color = color2, 
      x_jitter = 0.3,
      scatter_kws = {'alpha' : 1/20}, 
      line_kws={"color": color3})
g.set_xlabels('Average Total Score')
g.set_ylabels('Out-of-School Study Time (h/week) - Total')

plt.show()
In [79]:
# Faceted scatter plots on levels of mother's education
g = sb.FacetGrid(data = pisa, col = 'Education - Mother', col_wrap = 4, height = 5)
g.map(sb.regplot, 'Average Total Score', 'Out-of-School Study Time - Total', 
      color = color_female, 
      x_jitter = 0.3,
      scatter_kws = {'alpha' : 1/20}, 
      line_kws={"color": line})
g.set_xlabels('Average Total Score')
g.set_ylabels('Out-of-School Study Time (h/week) - Total')

plt.show()

Here we can see the Total Out-of-School Study Time vs. the Average Total Score. It's very clear that there is no meaningful relationship between these two. We can look into each of the variables that made up the Total Out-of-School Study Time to see if there are any observable relationships.

In [80]:
# Faceted scatter plots on levels of father's education
g = sb.FacetGrid(data = pisa, col = 'Education - Father', col_wrap = 4, height = 5)
g.map(sb.regplot, 'Average Total Score', 'Out-of-School Study Time - Guided Homework', 
      color = color2, 
      x_jitter = 0.3,
      scatter_kws = {'alpha' : 1/20}, 
      line_kws={"color": color3})
g.set_xlabels('Average Total Score')
g.set_ylabels('Out-of-School Study Time (h/week) - Guided Homework')

plt.show()
In [81]:
# Faceted scatter plots on levels of mother's education
g = sb.FacetGrid(data = pisa, col = 'Education - Mother', col_wrap = 4, height = 5)
g.map(sb.regplot, 'Average Total Score', 'Out-of-School Study Time - Guided Homework', 
      color = color_female, 
      x_jitter = 0.3,
      scatter_kws = {'alpha' : 1/20}, 
      line_kws={"color": line})
g.set_xlabels('Average Total Score')
g.set_ylabels('Out-of-School Study Time (h/week) - Guided Homework')

plt.show()

The relationship between Guided Homework Study Time and the Total Score is not promising. In fact, we see a subtle negative correlation at every level of education for both Fathers and Mothers.

In [82]:
# Faceted scatter plots on levels of father's education
g = sb.FacetGrid(data = pisa, col = 'Education - Father', col_wrap = 4, height = 5)
g.map(sb.regplot, 'Average Total Score', 'Out-of-School Study Time - Personal Tutor', 
      color = color2, 
      x_jitter = 0.3,
      scatter_kws = {'alpha' : 1/20}, 
      line_kws={"color": color3})
g.set_xlabels('Average Total Score')
g.set_ylabels('Out-of-School Study Time (h/week) - Personal Tutor')

plt.show()
In [83]:
# Faceted scatter plots on levels of mother's education
g = sb.FacetGrid(data = pisa, col = 'Education - Mother', col_wrap = 4, height = 5)
g.map(sb.regplot, 'Average Total Score', 'Out-of-School Study Time - Personal Tutor', 
      color = color_female, 
      x_jitter = 0.3,
      scatter_kws = {'alpha' : 1/20}, 
      line_kws={"color": line})
g.set_xlabels('Average Total Score')
g.set_ylabels('Out-of-School Study Time (h/week) - Personal Tutor')

plt.show()

The same can be said for Personal Tutor time and Score. This could of course be because the students who need more time with a Personal Tutor are the ones who are already struggling, but that claim is a little too large for this data analysis.

In [84]:
# Faceted scatter plots on levels of father's education
g = sb.FacetGrid(data = pisa, col = 'Education - Father', col_wrap = 4, height = 5)
g.map(sb.regplot, 'Average Total Score', 'Out-of-School Study Time - Commercial Company', 
      color = color2, 
      x_jitter = 0.3,
      scatter_kws = {'alpha' : 1/20}, 
      line_kws={"color": color3})
g.set_xlabels('Average Total Score')
g.set_ylabels('Out-of-School Study Time (h/week) - Commercial Company')

plt.show()
In [85]:
# Faceted scatter plots on levels of father's education
g = sb.FacetGrid(data = pisa, col = 'Education - Father', col_wrap = 4, height = 5)
g.map(sb.regplot, 'Average Total Score', 'Out-of-School Study Time - With Parent', 
      color = color2, 
      x_jitter = 0.3,
      scatter_kws = {'alpha' : 1/20}, 
      line_kws={"color": color3})
g.set_xlabels('Average Total Score')
g.set_ylabels('Out-of-School Study Time (h/week) - With Parent')

plt.show()

For Study Time with either a Commercial Company or a Parent, the plots by the Father's educational level show the same trend that we saw for Guided Homework and Personal Tutor, so there is no need to repeat them for the Mother's educational level. Once again we can see a tiny negative correlation, indicating that more Study Time of this kind does not guarantee a higher Score.

And last but not least, the most promising variable of the Out-of-School Study Time grouping: Homework.

In [86]:
# Faceted scatter plots on levels of father's education
g = sb.FacetGrid(data = pisa, col = 'Education - Father', col_wrap = 4, height = 5)
g.map(sb.regplot, 'Average Total Score', 'Out-of-School Study Time - Homework', color = color2,
      x_jitter = 0.3,
      scatter_kws = {'alpha' : 1/20}, 
      line_kws={"color": color3})
g.set_xlabels('Average Total Score')
g.set_ylabels('Out-of-School Study Time (h/week) - Homework')

plt.show()

Here we have a clear relationship indicating that the more time a student spends on Homework, the higher their Total Score tends to be. This holds for every educational level of the father, and it is quite a contrast to all the other Out-of-School Study Time variables.
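A complementary way to check this kind of trend, without relying on a fitted line, is to bin the Homework hours and compare the median Score per bin. The following is a minimal sketch under stated assumptions: `median_score_by_time_bin` is a hypothetical helper, the bin edges are arbitrary, and `demo` is made-up data rather than values from `pisa`.

```python
import pandas as pd

def median_score_by_time_bin(df, time_col, score_col, bins):
    """Cut time_col into intervals and return the median score in each interval."""
    binned = pd.cut(df[time_col], bins=bins, include_lowest=True)
    # observed=True keeps only the intervals that actually contain rows
    return df.groupby(binned, observed=True)[score_col].median()

# Made-up illustration data, NOT taken from the PISA dataset
demo = pd.DataFrame({
    'Out-of-School Study Time - Homework': [0, 1, 2, 3, 5, 6, 8, 10],
    'Average Total Score': [400, 420, 450, 460, 500, 510, 540, 560],
})

medians = median_score_by_time_bin(
    demo,
    'Out-of-School Study Time - Homework',
    'Average Total Score',
    bins=[0, 2, 5, 10],  # arbitrary example edges, in hours per week
)
print(medians)
```

If the medians rise from one bin to the next on the real data, that would agree with the upward regression lines in the plots. A flat or irregular pattern would suggest the line is driven by a few extreme points.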

In [87]:
# Faceted scatter plots on levels of mother's education
g = sb.FacetGrid(data = pisa, col = 'Education - Mother', col_wrap = 4, height = 5)
g.map(sb.regplot, 'Average Total Score', 'Out-of-School Study Time - Homework', color = color_female,
      x_jitter = 0.3,
      scatter_kws = {'alpha' : 1/20}, 
      line_kws={"color": line})
g.set_xlabels('Average Total Score')
g.set_ylabels('Out-of-School Study Time (h/week) - Homework')

plt.show()

Just as was the case for the father, the mother's levels of education all indicate the same positive correlation between Homework-related Study Time and Total Score.

As a final analysis, we can look at the father's level of education in comparison to the three Scores that make up the Total Score.

In [88]:
# Faceted scatter plots on levels of father's education
g = sb.FacetGrid(data = pisa, col = 'Education - Father', col_wrap = 4, height = 5)
g.map(sb.regplot, 'Average Science Score', 'Out-of-School Study Time - Homework', color = color2,
      x_jitter = 0.3,
      scatter_kws = {'alpha' : 1/20}, 
      line_kws={"color": color3})
g.set_xlabels('Average Science Score')
g.set_ylabels('Out-of-School Study Time (h/week) - Homework')

plt.show()
In [89]:
# Faceted scatter plots on levels of father's education
g = sb.FacetGrid(data = pisa, col = 'Education - Father', col_wrap = 4, height = 5)
g.map(sb.regplot, 'Average Math Score', 'Out-of-School Study Time - Homework', color = color2,
      x_jitter = 0.3,
      scatter_kws = {'alpha' : 1/20}, 
      line_kws={"color": color3})
g.set_xlabels('Average Math Score')
g.set_ylabels('Out-of-School Study Time (h/week) - Homework')

plt.show()
In [90]:
# Faceted scatter plots on levels of father's education
g = sb.FacetGrid(data = pisa, col = 'Education - Father', col_wrap = 4, height = 5)
g.map(sb.regplot, 'Average Reading Score', 'Out-of-School Study Time - Homework', color = color2,
      x_jitter = 0.3,
      scatter_kws = {'alpha' : 1/20}, 
      line_kws={"color": color3})
g.set_xlabels('Average Reading Score')
g.set_ylabels('Out-of-School Study Time (h/week) - Homework')
plt.show()

For each of these, we can see the same trend. There is a positive correlation between the amount of time a student puts into Homework-related Study Time and the Score they receive.
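The three plots can also be summarized in one small table by correlating Homework time with each subject Score. This is a sketch only: the `demo` frame holds made-up numbers in place of `pisa`, and Pearson correlation (the default of `DataFrame.corr`) is an assumption about which measure is appropriate here.

```python
import pandas as pd

homework_col = 'Out-of-School Study Time - Homework'
score_cols = ['Average Science Score', 'Average Math Score', 'Average Reading Score']

# Made-up illustration data, NOT taken from the PISA dataset
demo = pd.DataFrame({
    homework_col: [1, 2, 4, 6, 8, 10],
    'Average Science Score': [430, 455, 470, 500, 515, 540],
    'Average Math Score': [420, 450, 465, 490, 520, 530],
    'Average Reading Score': [440, 445, 480, 495, 510, 545],
})

# Pearson correlation of Homework time with each Score; the self-correlation is dropped
homework_corr = (
    demo[[homework_col] + score_cols]
    .corr()[homework_col]
    .drop(homework_col)
)
print(homework_corr.round(2))
```

On the real data the values would likely be much smaller than in this toy frame. What matters for the comparison is whether all three share the same sign.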

Talk about some of the relationships you observed in this part of the investigation. Were there features that strengthened each other in terms of looking at your feature(s) of interest?

Throughout this section, we looked further into what kind of effect parental education has on the scores of the students. In particular, we started off by checking whether male and female students scored differently at each parental level of education. For both the mother's and the father's level of education, the genders were consistent apart from very small differences.

Then we moved on to the relationship between Learning Times and the Scores for their respective subjects. As we saw in the bivariate analysis, Learning Time spent on Science had the best outcomes, but there was a catch, which I will come back to in the question below. For the other subjects, no relationship could be established.

And finally, we looked at the relationship between the Out-of-School Study Times and the Average Total Scores. This was equally negligible for all categories except for one: Homework. We continued on to see Homework in comparison to each of the scores that the Average Total Score was composed of, and the positive correlation persisted. I would still classify it as a weak relationship, but it nevertheless was there.

Were there any interesting or surprising interactions between features?

The most notable finding was the difference in students' scores across parental levels of education when compared against Learning Time for Science. It showed that although Learning Time for Science seems like a variable that would increase a student's Science Score, we cannot assume that this is the case for students in all circumstances. For students whose parents had lower educational levels, spending more time in school on Science-related topics did not show the positive correlation that we saw for students whose parents had higher educational levels. In other words, more time spent learning Science in school only came with a visible benefit when the parents had Post-secondary education or higher.